Sample SPARQL Queries

The queries on this page are tutorial queries: simple queries accompanied by documentation that explains the query syntax, purpose, and returned data. If you're new to SPARQL, begin with the sample queries here to get a handle on the syntax. If you're experienced with SPARQL, consider skipping to the advanced queries to interrogate the service with more relevant queries.


Introductory query

Find the subjects and objects of a given predicate, such as "ncit:P97" (DEFINITION)

# Find the subjects and objects of a given predicate, such as "ncit:P97" (DEFINITION)

PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
SELECT ?s ?o
FROM  <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
WHERE { ?s ncit:P97 ?o . }
ORDER BY ?s LIMIT 100
  • prefix declaration(s) abbreviate URIs/IRIs.
  • select identifies what information to return from the query.
  • from states the RDF graph(s) (datasets) to query.
  • where is followed by the triple pattern(s) to match.
    • In this case the graph pattern contains only one triple, where the subject ?s and the object ?o in the subject‑predicate‑object triple are variables (indicated by the ?), and the predicate is the (abbreviated) IRI ncit:P97.
  • The . (dot) indicates the end of each triple in a graph query pattern. A , or ; (comma or semicolon) can also terminate a triple, depending on the sequence of triples in the query (see below).
  • limit and order by modify the query results.

Expected Results: 100 rows. Here are the first 5 rows.

row num | s | o
1 | ncit:A1 | An association that specifies the parent of the branch encompassing a role's domain.
2 | ncit:A10 | An association created to allow the source CDRH to assign a parent to each concept with the intent of creating a hierarchy that includes only terms in which they are the contributing source.
3 | ncit:A11 | An association created to allow the source NICHD to assign a parent to each concept with the intent of creating a hierarchy that includes only terms in which they are a contributing source.
4 | ncit:A12 | An association created to relate a data element concept to the codelist term that is used to bundle its subset of valid value concepts.
5 | ncit:A13 | An association that indicates that a finding or lab test is related to a gene, possibly through a variant or product.

Note that the graph name is arbitrary; it is assigned to the dataset when the dataset is loaded into the quadstore. The graph names used in this service are listed on the Documentation page.

A number of triplestores are preconfigured with some of the most common prefixes, so you don't have to declare them in your query (see prefixes in Documentation). However, we suggest that you always declare your prefixes, especially if you contact us for help troubleshooting a query or if you use an external service (see, for example, the SPARQL query validator).
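
To illustrate what a prefix declaration abbreviates, here is a minimal sketch of the introductory query with the prefix removed and the predicate written out as a full IRI. The IRI is simply the ncit: namespace from the query above followed by the local name P97; the sketch is expected to match the same triples as the prefixed form.

```sparql
# Same introductory query, with ncit:P97 expanded to its full IRI
SELECT ?s ?o
FROM  <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
WHERE { ?s <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#P97> ?o . }
ORDER BY ?s LIMIT 100
```

Because the full IRI is spelled out, this form does not depend on any prefixes preconfigured in the triplestore.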


Subclasses of Cancer Gene in NCIt

Find all subclasses (children) of a given class, such as "ncit:C19540" (Cancer Gene), and return the subclasses with their associated relations and objects

# Find all subclasses (children) of a given class, such as "ncit:C19540" (Cancer Gene), and return the subclasses with their associated relations and objects

prefix ncit:  <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:  <http://www.w3.org/2002/07/owl#>
select ?sub_label ?rel_label ?obj_label
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  VALUES ?superclass { ncit:C19540 }
  ?subject rdfs:subClassOf+ ?superclass ;
           rdfs:label ?sub_label .
  OPTIONAL {
      ?subject ?relation ?object . 
      ?relation a owl:ObjectProperty ; 
                rdfs:label ?rel_label . 
      ?object rdfs:label ?obj_label .
  }
}
order by asc(?sub_label)  limit 500

This query targets the flat RDF representation of the NCIt.

  • Using values makes it easy to inject values for variables (here, the class bound to ?superclass).
  • The + following the rdfs:subClassOf predicate makes it a property path that matches one or more rdfs:subClassOf steps, so indirect subclasses are returned as well as direct ones.
  • The ; (semicolon) terminates a triple in the query pattern and indicates that the next triple in the pattern has the same subject, which therefore doesn't need to be specified.
    • Use of a , (comma) to terminate the triple would indicate that the next triple in the query has the same subject and predicate.
  • Use optional to include subjects in the resultset that may lack those properties. Consider it an "outer join."
  • a is shorthand for rdf:type, i.e. "?relation rdf:type owl:ObjectProperty". The a keyword is case-sensitive.
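
The comma shorthand does not appear in the query above, so here is a minimal sketch of it. The question it asks (which direct subclasses of Cancer Gene also have a second direct parent) and the variable ?other_parent are illustrative assumptions, not part of the original tutorial.

```sparql
prefix ncit:  <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
select ?sub_label ?other_parent
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  # The comma repeats both the subject (?subject) and the predicate (rdfs:subClassOf);
  # the semicolon then repeats only the subject.
  ?subject rdfs:subClassOf ncit:C19540 , ?other_parent ;
           rdfs:label ?sub_label .
  # Without this filter ?other_parent would also match ncit:C19540 itself.
  filter (?other_parent != ncit:C19540)
}
order by asc(?sub_label) limit 20
```

The expected results listed next belong to the Cancer Gene query above, not to this sketch.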

Expected Results: 500 rows. Here are the first 5 rows.

row num | sub_label | rel_label | obj_label
1 | ABCB1 2 Allele | Gene_Plays_Role_In_Process | ATP Hydrolysis
2 | ABCB1 2 Allele | Gene_Plays_Role_In_Process | Transmembrane Transport
3 | ABCB1 2 Allele | Gene_Plays_Role_In_Process | Ligand Binding
4 | ABCB1 2 Allele | Gene_Plays_Role_In_Process | Transport Process
5 | ABCB1 2 Allele | Gene_Found_In_Organism | Human

The property path in the graph query pattern would need to be (rdfs:subClassOf|(owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first) )+ to work both for the flat RDF and the OWL DL representation (assuming only class restrictions appear in the owl:equivalentClass expression). The graph patterns in the optional block would also need to be modified to traverse the OWL class restrictions involved in the relations.
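
A minimal sketch of the combined property path described above, applied only to the subclass part of the query. The rdf: prefix is added because the path uses rdf:rest and rdf:first. The from clause reuses the graph name from the query above as a placeholder; that is an assumption, since the graph holding the OWL DL representation may have a different name. The optional block is left out because it would also need to be rewritten for the OWL class restrictions.

```sparql
prefix ncit:  <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
prefix owl:   <http://www.w3.org/2002/07/owl#>
select distinct ?subject ?sub_label
# Placeholder graph name (assumption); replace with the graph that holds the OWL DL data
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  VALUES ?superclass { ncit:C19540 }
  # Each step is either a plain rdfs:subClassOf link or a walk through the
  # members of an owl:intersectionOf list inside an owl:equivalentClass expression
  ?subject (rdfs:subClassOf|(owl:equivalentClass/owl:intersectionOf/rdf:rest*/rdf:first))+ ?superclass ;
           rdfs:label ?sub_label .
}
order by asc(?sub_label) limit 500
```

Requiring rdfs:label on ?subject keeps unlabeled intermediate nodes reached along the path out of the results.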


Annotations and annotated property values of "Heart Failure" in NCIt

Find all annotations and annotated property values of a given class, such as "ncit:C50577" (Heart Failure)

# Find all annotations and annotated property values of a given class, such as "ncit:C50577" (Heart Failure)

prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>

select DISTINCT ?subject ?prop_label ?prop_value ?annotations
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  values ?subject { ncit:C50577 }
  ?subject ?prop ?prop_value .
  ?prop rdfs:label ?prop_label ;
    a owl:AnnotationProperty .

  optional {
    SELECT DISTINCT ?subject ?prop ?prop_value 
           (GROUP_CONCAT(?annot ; separator=' | ') AS ?annotations)
    where {
      ?an a owl:Axiom ;
         owl:annotatedSource ?subject ;
         owl:annotatedProperty ?prop ;
         owl:annotatedTarget ?prop_value ;
            ?annotation ?annot_value .
      ?annotation rdfs:label ?annotation_label .
      BIND( CONCAT( ?annotation_label , '=' , ?annot_value ) AS ?annot )
      } GROUP BY ?subject ?an ?prop ?prop_value
   }
}
order by ?prop_label
limit 100
  • select sub-queries are supported, in this case inside an optional block.
  • distinct is a solution modifier that eliminates duplicates.
  • group_concat aggregates multiple properties/values into a single element in the resultset.
  • bind binds a variable to an expression, in this case ?annot to the result of concat.
  • as is the binding keyword, used in both the bind and the select clause.
  • concat combines string variables and literals.
  • group by partitions the solution into different groups prior to an aggregation.

Expected Results: 41 rows. Here are the first 5 rows.

row num | subject | prop_label | prop_value | annotations
1 | ncit:C50577 | ALT_DEFINITION | A disorder characterized by the inability of the heart to pump blood at an adequate volume to meet tissue metabolic requirements, or, the ability to do so only at an elevation in the filling pressure. | Definition Source=CTCAE
2 | ncit:C50577 | ALT_DEFINITION | Inability of the heart to meet tissue metabolic requirements. | Definition Source=NICHD
3 | ncit:C50577 | ALT_DEFINITION | A clinical condition in which the function of the heart is inadequate to meet the metabolic needs of the body. | Definition Source=ACC/AHA
4 | ncit:C50577 | Concept_In_Subset | ncit:C191671 |
5 | ncit:C50577 | Concept_In_Subset | ncit:C191396 |

The owl:Axiom resources happen to be blank nodes in the graph. In this query pattern, ?an stands for the blank node. In similar queries ?an could be replaced by empty brackets, like so: []. Here, however, the ?an variable is needed so that the group by expression can refer to it. The next query shows yet another way to write blank nodes.
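
As a minimal sketch of the empty-brackets form mentioned above, the query below lists the individual annotations on Heart Failure without aggregating them. Since there is no group by, nothing needs to refer back to the axiom node, so [] can stand in for ?an. The projected columns are an illustrative choice rather than part of the original tutorial.

```sparql
prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>

select ?prop ?prop_value ?annotation_label ?annot_value
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  # [] matches the anonymous owl:Axiom node without giving it a variable name
  [] a owl:Axiom ;
     owl:annotatedSource ncit:C50577 ;
     owl:annotatedProperty ?prop ;
     owl:annotatedTarget ?prop_value ;
     ?annotation ?annot_value .
  ?annotation rdfs:label ?annotation_label .
}
limit 100
```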


Find data elements where the main concept of the object class has a label containing a specific string such as "Person"

# Find data elements where the main concept of the object class has a label containing a specific string such as "Person"

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dr: <http://cbiit.nci.nih.gov/caDSR#>
prefix idr: <http://www.iso.org/11179/MDR#>
prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>

select distinct ?concept_label 
       (group_concat(?data_element ; separator=', ') as ?data_elements) 
       (COUNT(?data_element) as ?element_count)
FROM <http://cbiit.nci.nih.gov/caDSR>
FROM <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
 ?data_element idr:Object_Class [ dr:main_concept ?concept ] .
 ?concept rdfs:label ?concept_label .
 ?concept_label bif:contains "person" .
 FILTER NOT EXISTS { ?concept rdfs:label "Person" }
}
group by ?concept_label
order by desc(?element_count) limit 10
  • The count function counts the bound values of a variable in the resultset.
  • Multiple from clauses specify multiple datasets (graph names).
  • The [ ... ] brackets indicate a blank node as the object of the first triple, where the query pattern to match the blank node is within the brackets. This would be equivalent to
      ?data_element idr:Object_Class ?an . 
      ?an dr:main_concept ?concept .
    
    Which form you choose depends on whether you need to refer to the node elsewhere in the query (referenceability) and on readability.
  • The bif:contains predicate is a built-in function in this quadstore for indexed-text searches.
  • filter constrains the solutions of a pattern match; it accepts expressions and various functions returning booleans.
  • not exists tests that a query pattern has no match in a solution. In this case, if ?concept has a label of "Person", then this ?concept is excluded from the resultset.

Expected Results: 5 rows.

row num concept_label data_elements element_count
1 Contact Person http://cbiit.nci.nih.gov/caDSR#DE3175768, http://cbiit.nci.nih.gov/caDSR#DE3175769, http://cbiit.nci.nih.gov/caDSR#DE3175770, http://cbiit.nci.nih.gov/caDSR#DE3175771, http://cbiit.nci.nih.gov/caDSR#DE3175772, http://cbiit.nci.nih.gov/caDSR#DE3175773, http://cbiit.nci.nih.gov/caDSR#DE3760786, http://cbiit.nci.nih.gov/caDSR#DE2761955, http://cbiit.nci.nih.gov/caDSR#DE2838201, http://cbiit.nci.nih.gov/caDSR#DE7572752, http://cbiit.nci.nih.gov/caDSR#DE7574307, http://cbiit.nci.nih.gov/caDSR#DE7574308, http://cbiit.nci.nih.gov/caDSR#DE4396201, http://cbiit.nci.nih.gov/caDSR#DE4396205, http://cbiit.nci.nih.gov/caDSR#DE4396209, http://cbiit.nci.nih.gov/caDSR#DE4396293, http://cbiit.nci.nih.gov/caDSR#DE4396313, http://cbiit.nci.nih.gov/caDSR#DE4396416, http://cbiit.nci.nih.gov/caDSR#DE4396419, http://cbiit.nci.nih.gov/caDSR#DE4396422, http://cbiit.nci.nih.gov/caDSR#DE2238019, http://cbiit.nci.nih.gov/caDSR#DE2238020, http://cbiit.nci.nih.gov/caDSR#DE2238021, http://cbiit.nci.nih.gov/caDSR#DE2238022, http://cbiit.nci.nih.gov/caDSR#DE2238023, http://cbiit.nci.nih.gov/caDSR#DE2238024, http://cbiit.nci.nih.gov/caDSR#DE3977134, http://cbiit.nci.nih.gov/caDSR#DE3977211, http://cbiit.nci.nih.gov/caDSR#DE2004365, http://cbiit.nci.nih.gov/caDSR#DE2004366, http://cbiit.nci.nih.gov/caDSR#DE2004367, http://cbiit.nci.nih.gov/caDSR#DE2004368, http://cbiit.nci.nih.gov/caDSR#DE2189053, http://cbiit.nci.nih.gov/caDSR#DE2416400, http://cbiit.nci.nih.gov/caDSR#DE2416402, http://cbiit.nci.nih.gov/caDSR#DE2597588, http://cbiit.nci.nih.gov/caDSR#DE2597589, http://cbiit.nci.nih.gov/caDSR#DE2597590, http://cbiit.nci.nih.gov/caDSR#DE2597591, http://cbiit.nci.nih.gov/caDSR#DE2739352, http://cbiit.nci.nih.gov/caDSR#DE2916872, http://cbiit.nci.nih.gov/caDSR#DE14461880, http://cbiit.nci.nih.gov/caDSR#DE2194308, http://cbiit.nci.nih.gov/caDSR#DE2407060, http://cbiit.nci.nih.gov/caDSR#DE3377592, http://cbiit.nci.nih.gov/caDSR#DE8122898, http://cbiit.nci.nih.gov/caDSR#DE5611061, 
http://cbiit.nci.nih.gov/caDSR#DE2201174, http://cbiit.nci.nih.gov/caDSR#DE2613013, http://cbiit.nci.nih.gov/caDSR#DE2613014, http://cbiit.nci.nih.gov/caDSR#DE2613015, http://cbiit.nci.nih.gov/caDSR#DE2727234, http://cbiit.nci.nih.gov/caDSR#DE2727235, http://cbiit.nci.nih.gov/caDSR#DE2748014, http://cbiit.nci.nih.gov/caDSR#DE2748089, http://cbiit.nci.nih.gov/caDSR#DE2748090, http://cbiit.nci.nih.gov/caDSR#DE2748091, http://cbiit.nci.nih.gov/caDSR#DE2771335, http://cbiit.nci.nih.gov/caDSR#DE2771336, http://cbiit.nci.nih.gov/caDSR#DE2771337, http://cbiit.nci.nih.gov/caDSR#DE2869690, http://cbiit.nci.nih.gov/caDSR#DE2869691, http://cbiit.nci.nih.gov/caDSR#DE2869694, http://cbiit.nci.nih.gov/caDSR#DE2869913, http://cbiit.nci.nih.gov/caDSR#DE2870468, http://cbiit.nci.nih.gov/caDSR#DE2870469, http://cbiit.nci.nih.gov/caDSR#DE2898748, http://cbiit.nci.nih.gov/caDSR#DE2898749, http://cbiit.nci.nih.gov/caDSR#DE2898750, http://cbiit.nci.nih.gov/caDSR#DE2915809, http://cbiit.nci.nih.gov/caDSR#DE2915810, http://cbiit.nci.nih.gov/caDSR#DE2915811, http://cbiit.nci.nih.gov/caDSR#DE2915812, http://cbiit.nci.nih.gov/caDSR#DE2927726, http://cbiit.nci.nih.gov/caDSR#DE2953675, http://cbiit.nci.nih.gov/caDSR#DE2959877, http://cbiit.nci.nih.gov/caDSR#DE2959927, http://cbiit.nci.nih.gov/caDSR#DE2959928, http://cbiit.nci.nih.gov/caDSR#DE2959929, http://cbiit.nci.nih.gov/caDSR#DE2959930, http://cbiit.nci.nih.gov/caDSR#DE2960016, http://cbiit.nci.nih.gov/caDSR#DE2960023, http://cbiit.nci.nih.gov/caDSR#DE2960024, http://cbiit.nci.nih.gov/caDSR#DE2960025, http://cbiit.nci.nih.gov/caDSR#DE2960026, http://cbiit.nci.nih.gov/caDSR#DE2960027, http://cbiit.nci.nih.gov/caDSR#DE2960028, http://cbiit.nci.nih.gov/caDSR#DE2960029, http://cbiit.nci.nih.gov/caDSR#DE2960030, http://cbiit.nci.nih.gov/caDSR#DE2960031, http://cbiit.nci.nih.gov/caDSR#DE2960032, http://cbiit.nci.nih.gov/caDSR#DE2960033, http://cbiit.nci.nih.gov/caDSR#DE2960034, http://cbiit.nci.nih.gov/caDSR#DE2960035, 
http://cbiit.nci.nih.gov/caDSR#DE2960036, http://cbiit.nci.nih.gov/caDSR#DE2960037, http://cbiit.nci.nih.gov/caDSR#DE2960038, http://cbiit.nci.nih.gov/caDSR#DE2960047, http://cbiit.nci.nih.gov/caDSR#DE2960048, http://cbiit.nci.nih.gov/caDSR#DE2960049, http://cbiit.nci.nih.gov/caDSR#DE2998203, http://cbiit.nci.nih.gov/caDSR#DE2998204, http://cbiit.nci.nih.gov/caDSR#DE2998205 "103"^^<http://www.w3.org/2001/XMLSchema#integer>
2 Person Contact Information http://cbiit.nci.nih.gov/caDSR#DE2224377, http://cbiit.nci.nih.gov/caDSR#DE2224378, http://cbiit.nci.nih.gov/caDSR#DE2224379, http://cbiit.nci.nih.gov/caDSR#DE2224382, http://cbiit.nci.nih.gov/caDSR#DE2224383, http://cbiit.nci.nih.gov/caDSR#DE2224384, http://cbiit.nci.nih.gov/caDSR#DE2224386, http://cbiit.nci.nih.gov/caDSR#DE2423137, http://cbiit.nci.nih.gov/caDSR#DE2423138, http://cbiit.nci.nih.gov/caDSR#DE2423139, http://cbiit.nci.nih.gov/caDSR#DE2423140, http://cbiit.nci.nih.gov/caDSR#DE2438184, http://cbiit.nci.nih.gov/caDSR#DE2438185, http://cbiit.nci.nih.gov/caDSR#DE2438186 "14"^^<http://www.w3.org/2001/XMLSchema#integer>
3 Responsible Person http://cbiit.nci.nih.gov/caDSR#DE7051563, http://cbiit.nci.nih.gov/caDSR#DE2181803, http://cbiit.nci.nih.gov/caDSR#DE5242015, http://cbiit.nci.nih.gov/caDSR#DE5242022, http://cbiit.nci.nih.gov/caDSR#DE5334617, http://cbiit.nci.nih.gov/caDSR#DE2177, http://cbiit.nci.nih.gov/caDSR#DE2006163, http://cbiit.nci.nih.gov/caDSR#DE2429645, http://cbiit.nci.nih.gov/caDSR#DE2430250, http://cbiit.nci.nih.gov/caDSR#DE2452692, http://cbiit.nci.nih.gov/caDSR#DE3171577 "11"^^<http://www.w3.org/2001/XMLSchema#integer>
4 Person Name http://cbiit.nci.nih.gov/caDSR#DE2483567, http://cbiit.nci.nih.gov/caDSR#DE2726423, http://cbiit.nci.nih.gov/caDSR#DE2726424 "3"^^<http://www.w3.org/2001/XMLSchema#integer>
5 Minor Person http://cbiit.nci.nih.gov/caDSR#DE6662560, http://cbiit.nci.nih.gov/caDSR#DE7426184 "2"^^<http://www.w3.org/2001/XMLSchema#integer>

The filter is mostly for illustration; a user would generally not want to exclude matches to the literal "Person".

The SPARQL specification does not include indexed text searching. Different triplestores support it with different keywords and syntax, which makes such queries non-portable: you must edit them if you switch back-ends. An alternative is to use filter regex(...), but this can be much slower in comparison.
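
A minimal sketch of the portable filter regex alternative mentioned above, using only standard SPARQL functions. The "i" flag makes the match case-insensitive, and str() is applied so that the label is treated as a plain string. This version matches any label containing the substring "person", so its results may differ from the word-based bif:contains search; the aggregation from the query above is omitted to keep the sketch short.

```sparql
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dr: <http://cbiit.nci.nih.gov/caDSR#>
prefix idr: <http://www.iso.org/11179/MDR#>

select distinct ?concept ?concept_label
FROM <http://cbiit.nci.nih.gov/caDSR>
FROM <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  ?data_element idr:Object_Class [ dr:main_concept ?concept ] .
  ?concept rdfs:label ?concept_label .
  # Standard SPARQL substring match instead of the quadstore-specific bif:contains
  filter regex(str(?concept_label), "person", "i")
}
limit 10
```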


Is a class in the tree under "Gene"?

Check whether a class, such as "ncit:C18145" (Informatics), falls under the class "Gene"

# Check whether a class, such as "ncit:C18145" (Informatics), falls under the class "Gene"

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
ASK {
  GRAPH <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf> {
      ncit:C18145 rdfs:subClassOf+ ?super .
      ?super rdfs:label "Gene" .
  }
}
  • ask returns the boolean true if the query pattern has at least one solution, and false otherwise.
  • graph denotes the name of the graph (dataset) to be queried.

Expected Result: False
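
When an ask query returns false, it can help to look at what the pattern does match. The following is a minimal sketch that turns the same property path into a select listing every ancestor of the class together with its label; the variable ?super_label is introduced here for illustration.

```sparql
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
select ?super ?super_label
where {
  GRAPH <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf> {
      # Same path as in the ask query, but the label is returned instead of tested
      ncit:C18145 rdfs:subClassOf+ ?super .
      ?super rdfs:label ?super_label .
  }
}
order by ?super_label
```

If "Gene" does not appear among the returned labels, that is consistent with the ask query answering false.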


Describe the NCIt class C18369

Describe a given resource, such as "ncit:C18369". It returns the triples in which "ncit:C18369" is either the subject or the object.

# Describe a given resource, such as "ncit:C18369". It returns the triples in which "ncit:C18369" is either the subject or the object.

prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
DESCRIBE ncit:C18369
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
  • describe is used to describe resources. The SPARQL spec does not specify exactly what should be returned. This service returns the triples where the entity is the subject as well as the triples where it is the object, including blank nodes if they are part of the solution graph.

Expected Results: 24 rows. Here are the first 5 rows.

row num | subject | predicate | object
1 | ncit:C18369 | ncit:P97 | "Oncogene TIM encodes a predicted 60 kD protein containing a DBL homology domain, shared by several signal transducing regulators of small GTP-binding proteins. TIM is thought to control cytoskeletal organization through regulation of small GTP-binding proteins. The human gene is located at 7q33-q35."^^<http://www.w3.org/2001/XMLSchema#string>
2 | ncit:C18369 | ncit:NHC0 | "C18369"^^<http://www.w3.org/2001/XMLSchema#string>
3 | ncit:C18369 | ncit:R155 | ncit:C13625
4 | b0_b113678142 | <http://www.w3.org/2002/07/owl#annotatedSource> | ncit:C18369
5 | b0_b113678144 | <http://www.w3.org/2002/07/owl#annotatedSource> | ncit:C18369
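
Because the output of describe is implementation-dependent, a portable approximation can be written as an ordinary select. The sketch below is an assumption about how to mimic the behavior described above (triples with the entity as subject, plus triples with it as object); it is not the service's own implementation, and its results may not match the describe output exactly.

```sparql
prefix ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
select ?subject ?predicate ?object
from <http://ncicb.nci.nih.gov/xml/owl/EVS/ThesaurusInf.rdf>
where {
  # Triples in which the entity is the subject
  { values ?subject { ncit:C18369 }  ?subject ?predicate ?object . }
  union
  # Triples in which the entity is the object
  { values ?object { ncit:C18369 }  ?subject ?predicate ?object . }
}
```

Using values in each branch keeps the entity in one obvious place per branch, so the sketch can be reused for another resource by changing the two IRIs.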