RCSB PDB Help
Attribute Search
Overview
What are Attributes?
Attributes are properties of a 3D structure that have specific text or numerical values and that can be used to identify one structure or a group of structures for exploration and analysis. The attributes available for searching include:
- information about the entry, e.g., who solved the structure, when, and by what method; where the structure was published, the names and types of molecules present in the structure, experimental details, and annotations
- properties of small molecules, ligands, drugs, and polymer building blocks (or residues) such as amino acids, and nucleotides
- information about the experiment performed
Why use Attribute Search?
The Attribute Search on RCSB.org allows searching in specific attributes such as Structure Title, Release Date, Source Organism Taxonomy Name, etc. Limiting your search to specific attributes can yield more precise results. For instance, if you are looking for structures from a particular author, it is more efficient to limit your search to the Structure Author attribute. If your attribute search retrieves too many matches, you can construct complex queries to retrieve more manageable results. Complex queries can be constructed by using several attributes together, combining them with Boolean operators “AND”, “OR”, and “NOT”.
Documentation
Types of Structure Attributes
Attributes or properties available for searching the archive are grouped according to entry, entity, instance, assembly, experimental details, and annotations. These are briefly summarized here and described in detail below. See the Attribute Details section for all available attributes.
Identifiers and Keywords
These attributes include identifiers assigned to entries (PDB structures or Computed Structure Models, CSMs), experimental maps (EMDB), macromolecules included in the structure (entities such as proteins or nucleic acids), and related keywords.
Entry-related Attributes
These attributes focus on properties of the entry (experimental structure) and include
- summary information about entry deposition (such as titles, authors, affiliations, and dates)
- entry composition (types and numbers of protein, DNA, and RNA macromolecules, molecular weights, and number of non-polymer entities in the entry)
- primary citations describing the entry, including citation information, abstract, and common identifiers, e.g., DOI
- attributes related to all citations that reference the entry
The Structure Determination Methodology attribute allows users to query experimental structures only, CSMs only, or both.
Computed Structure Model Attributes
These attributes pertain only to Computed Structure Models (CSMs) and include
- CSM entry identifier
- Source Database for CSMs
- The global quality score (pLDDT)
Entity-related Attributes
These attributes focus on properties of polymeric and non-polymeric molecules present in the entry. They include details about
- macromolecules (proteins or nucleic acids) including names, types, and length of polymers, mutations and modifications, organism taxonomies and enzyme classifications, and information on membrane association
- non-polymer small molecules including component ID for molecule of interest and binding properties
- oligosaccharide (or branched polymer) details including structural features and components
Instance-related Attributes
These attributes focus on annotations of each instance of polymeric entities, e.g., SCOP and CATH classifications.
Assembly-related Attributes
These attributes are related to biological assemblies, including size and composition of the assembly, experimental support for assembly assignment, and assembly symmetry.
Note: If you are expecting to see all assemblies that match your query, remember to change search results to Assemblies.
Sample, Experiment, and Method-related Attributes
These attributes can be used to design queries based on the structure determination and include details about
- experimental method types, including overall resolution, and software employed
- properties of crystals used for structure determination, including unit cell dimensions, space group, crystallization method(s)
- experiment-specific details for:
  - X-ray crystallography, including attributes related to refinement, B-values, and R-values
  - NMR data collection and refinement
  - EM data collection and refinement
Types of Chemical Attributes
Small Molecule and BIRD Molecule Reference Data
These attributes enable queries based on the presence of specific chemical components and/or larger Biologically Interesting Molecules (or BIRD Molecules such as peptide-like inhibitors and antibiotic molecules) in the PDB. The attributes include chemical names and identifiers, atom counts, and molecular weights. These are useful for searching components that are parts of polymers and oligosaccharides (amino acids, nucleotides, saccharides etc.) as well as non-covalently bound ligands, inhibitors, and drugs.
Types of Operators and how to use them
Queries can be constructed by assigning values or ranges for selected attributes and combining them with suitable Boolean operators. Depending on the type of attribute, one can use different operators to create search conditions. Below are all the possible Operators for the different types.
Numerical and date attributes operators
Numerical values of these attributes can be used to identify 3D structures with values equal to (=), less than (<), or greater than (>) a specified number. For some attributes, ranges of values can be assigned, or the query can be used to identify structures with any value assigned to that attribute. These operators can be selected from the following list:
- =, >, >=, <, <= : standard mathematical operators
- range (upper excl.): range of numerical values for an attribute (from a lower bound to an upper bound), excluding the upper bound
- range (upper incl.): range of numerical values for an attribute (from a lower bound to an upper bound), including the upper bound
- is not empty: match for all entries that have any numerical data for the corresponding attribute. Entries with no data for the attribute are not matched. Note: this operator does not take any input value
- last 7 days: a relative date search for the last 1 week (from the date of the query). It allows you to create searches that can be run periodically without needing to alter the query. Note: this operator does not take any input value
- last 30 days: a relative date search for the last 1 month (from the date of the query). It allows you to create searches that can be run periodically without needing to alter the query. Note: this operator does not take any input value
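The difference between the two range operators, and the idea behind the relative date operators, can be illustrated with a minimal local sketch. This is not the search service's implementation; the function names are hypothetical, and the sketch assumes the lower bound is always inclusive, which this page does not state.

```python
from datetime import date, timedelta


def in_range(value, lower, upper, include_upper):
    """Return True if value lies between lower and upper.

    Assumption: the lower bound is always inclusive; only the handling
    of the upper bound differs between the two range operators.
    """
    if include_upper:
        return lower <= value <= upper
    return lower <= value < upper


def in_last_days(release_date, days, today):
    """Relative date check: True if release_date is within `days` days before `today`."""
    return today - timedelta(days=days) <= release_date <= today


# A value equal to the upper bound matches only "range (upper incl.)".
print(in_range(2.0, 1.5, 2.0, include_upper=False))
print(in_range(2.0, 1.5, 2.0, include_upper=True))

# The same relative check gives different answers as `today` moves forward,
# which is why such a query can be rerun without editing it.
print(in_last_days(date(2024, 1, 10), 7, today=date(2024, 1, 15)))
```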
Exact match text attributes operators
Text-based attributes may be of two types - ones that exactly match a given vocabulary list and those that are free form. Attributes of the former type can be included in queries as a specific word/phrase, a list of words/phrases, or simply based on whether or not the entry includes something in that attribute. The operators can be selected from the following list:
- is: the attribute must match the given text value exactly.
- is any of: the attribute must match any of the given values in a comma-separated list. The list can also be a single value.
- is not empty: this operator does not take an argument. It matches entries that have any text data for the corresponding attribute.
Free text attributes operators
Some text-based attributes are free-format, i.e., they do not use a specific vocabulary but may include specific words or phrases. Queries may also be designed to identify structures with any content in these attributes:
- has exact phrase: the attribute's text must contain the given phrase
- has any of words: the attribute's text must contain at least one of the given words, in any order
- is not empty: this operator does not take an argument. It matches entries that have any text data for the corresponding attribute
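As a rough illustration of how the text operators differ, the sketch below applies them to plain Python strings. It reads "has any of words" as matching when at least one word is present, uses a simple lowercase word split, and does not attempt to reproduce the tokenization or ranking of the actual search service. All function names are hypothetical.

```python
import re


def words(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_exact_phrase(text, phrase):
    """True if the phrase's words appear consecutively, in order, in the text."""
    haystack, needle = words(text), words(phrase)
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )


def has_any_of_words(text, query):
    """True if at least one query word appears in the text."""
    return bool(set(words(text)) & set(words(query)))


def is_any_of(value, allowed):
    """Exact match of value against a comma-separated list of allowed values."""
    return value in [item.strip() for item in allowed.split(",")]


title = "Crystal structure of sperm whale myoglobin"
print(has_exact_phrase(title, "sperm whale"))
print(has_exact_phrase(title, "whale sperm"))
print(has_any_of_words(title, "hemoglobin myoglobin"))
print(is_any_of("DNA", "DNA, RNA"))
```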
Examples
Find all experimental structures that include a chlorophyll molecule
- Select Nonpolymer Molecular Features → Chemical Name and enter chlorophyll in the input box
- Change the operator to has exact phrase
- Click Search to return matching structures
Find myoglobin structures
- Select Polymer Molecular Features → Macromolecule Name and enter myoglobin in the input box
- Click Search to return matching experimental structures
- Enable Include CSM and run the search again to retrieve both matching experimental structures and predicted models
Find structures related to sperm whale myoglobin UniProt entry
- Select IDs and Keywords → Accession Code(s) - UniProt and enter P02185 in the input box
- Click Search to return matching experimental structures
- Enable Include CSM and run the search again to retrieve both matching experimental structures and predicted models
Find structures deposited by John Kendrew
- Select Deposition → Structure Author and enter Kendrew, J.C. in the input box
- Click Search to return matching structures
Find latest structures
- Select Structure Details → Release Date
- Change the operator to last 7 days
- Click Search to return matching structures
PDB structures are updated each week on Wednesday 00:00 UTC (Coordinated Universal Time).
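For programmatic users, a relative date search of this kind might be expressed as a Search API request body along the following lines. This is a sketch only: the attribute name, the "greater" operator, and the "now-1w" relative date expression are recalled from the Search API rather than taken from this page, and should be checked against the Search API documentation before use.

```python
import json


def released_since(date_expr):
    """Build a request body for entries released after `date_expr`.

    Assumptions: the attribute name, the operator name, and the relative
    date syntax (e.g. "now-1w") are not confirmed by this help page.
    """
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": "rcsb_accession_info.initial_release_date",
                "operator": "greater",
                "value": date_expr,
            },
        },
        "return_type": "entry",
    }


print(json.dumps(released_since("now-1w"), indent=2))
```

Because the date is relative, the same body can be stored and resubmitted after each weekly update.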
Find structures of the antibiotic inactivating enzyme FosA
PDB structures of the FosA enzyme can be identified using annotations from the Comprehensive Antibiotic Resistance Database (CARD) resource, which provides curated information about antibiotic resistance genes, their products, and associated resistance mechanisms.
- Select Polymer Molecular Features → Lineage Identifier - CARD and enter CARD ID for FosA (ARO:3000149)
- Click Search to return matching structures
Find entries with four or more distinct proteins and at least one disulfide linkage
- Select Entry Features → Number of Distinct Protein Entities, change the operator to >=, and set the value to 4
- For the second attribute, select Entry Features → Disulfide Bond Count per Deposited Model, change the operator to >=, and set the value to 1
- Click Search to return matching structures
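The two conditions above are combined with AND. A possible Search API equivalent is sketched below as a group node containing two terminal nodes; the attribute names are assumptions and the helper `at_least` is hypothetical.

```python
import json


def at_least(attribute, minimum):
    """Terminal node requiring `attribute` >= `minimum`."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {
            "attribute": attribute,
            "operator": "greater_or_equal",
            "value": minimum,
        },
    }


# Attribute names below are assumptions, not confirmed by this page.
query = {
    "type": "group",
    "logical_operator": "and",
    "nodes": [
        at_least("rcsb_entry_info.polymer_entity_count_protein", 4),
        at_least("rcsb_entry_info.disulfide_bond_count", 1),
    ],
}

print(json.dumps({"query": query, "return_type": "entry"}, indent=2))
```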
Find entries released in 2020 that don’t include DNA
- Select Structure Details → Release Date, change the operator to range (upper incl.), and enter 2020-01-01 and 2020-12-31
- For the second attribute, select Polymer Molecular Features → Polymer Entity Type and choose DNA
- Then click (+) NOT to negate the value
- Click Search to return matching structures
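A date range combined with a negated condition could be sketched for the Search API as follows. The upper bound is included so that 2020-12-31 itself matches. The attribute names, the shape of the range value, and the "negation" flag are assumptions recalled from the Search API, and the function name is hypothetical.

```python
import json


def released_in_year_without(year, polymer_type):
    """Group node: released within `year` AND NOT containing `polymer_type`."""
    date_range = {
        "type": "terminal",
        "service": "text",
        "parameters": {
            # Assumed attribute name for the release date.
            "attribute": "rcsb_accession_info.initial_release_date",
            "operator": "range",
            "value": {
                "from": f"{year}-01-01",
                "to": f"{year}-12-31",
                "include_lower": True,
                "include_upper": True,
            },
        },
    }
    not_type = {
        "type": "terminal",
        "service": "text",
        "parameters": {
            # Assumed attribute name for the polymer entity type.
            "attribute": "entity_poly.rcsb_entity_polymer_type",
            "operator": "exact_match",
            "negation": True,  # plays the role of the NOT toggle
            "value": polymer_type,
        },
    }
    return {"type": "group", "logical_operator": "and", "nodes": [date_range, not_type]}


print(json.dumps(released_in_year_without(2020, "DNA"), indent=2))
```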
Find entries of human or mouse hydrolases that were determined using Electron Microscopy, and have a resolution better than 5 Å
This query requires combining multiple attributes. Follow the steps below to construct the search:
- Specify the enzyme type (Hydrolases):
  - Select Polymer Molecular Features → Enzyme Classification Number and enter "3" in the input box. EC number 3 designates hydrolases, a class of enzymes that catalyze hydrolysis reactions, breaking bonds by adding water
- Limit results by experimental method:
  - Click (+) Attribute
  - Select Methods → Experimental Method and choose Electron Microscopy
- Restrict results by resolution:
  - Click (+) Attribute
  - Select Methods → Refinement Resolution
  - Change the operator to <
  - Set the maximum resolution to 5 Å
- Specify the organisms (Human or Mouse). Because the search must match either organism, you will need to create a group of attributes:
  - Click (+) Group
  - Within the group, add Polymer Molecular Features → Scientific Name of the Source Organism. Enter Homo sapiens
  - Populate a second attribute inside the same group: Polymer Molecular Features → Scientific Name of the Source Organism. Enter Mus musculus
  - Set the group’s logical operator to OR
- Click Search to return matching structures
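The grouping logic of this search (three conditions joined by AND, plus an OR group for the two organisms) can be sketched as nested Search API nodes. Every attribute name in `ASSUMED_ATTRIBUTES`, the operator names, and the value "ELECTRON MICROSCOPY" are assumptions that should be verified against the Search API documentation; the helpers `terminal` and `group` are hypothetical.

```python
import json

# Assumed attribute names; none are confirmed by this help page.
ASSUMED_ATTRIBUTES = {
    "ec": "rcsb_polymer_entity.rcsb_ec_lineage.id",
    "method": "exptl.method",
    "resolution": "rcsb_entry_info.resolution_combined",
    "organism": "rcsb_entity_source_organism.scientific_name",
}


def terminal(attribute, operator, value):
    """A single search condition on one attribute."""
    return {
        "type": "terminal",
        "service": "text",
        "parameters": {"attribute": attribute, "operator": operator, "value": value},
    }


def group(logical_operator, *nodes):
    """Combine conditions (or other groups) with "and" or "or"."""
    return {"type": "group", "logical_operator": logical_operator, "nodes": list(nodes)}


query = group(
    "and",
    terminal(ASSUMED_ATTRIBUTES["ec"], "exact_match", "3"),
    terminal(ASSUMED_ATTRIBUTES["method"], "exact_match", "ELECTRON MICROSCOPY"),
    terminal(ASSUMED_ATTRIBUTES["resolution"], "less", 5),
    # The organism conditions sit in their own OR group, as in the steps above.
    group(
        "or",
        terminal(ASSUMED_ATTRIBUTES["organism"], "exact_match", "Homo sapiens"),
        terminal(ASSUMED_ATTRIBUTES["organism"], "exact_match", "Mus musculus"),
    ),
)

print(json.dumps({"query": query, "return_type": "entry"}, indent=2))
```

Keeping the organisms in a separate group is what prevents the OR from applying to the enzyme, method, and resolution conditions.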
For Programmatic Users
Users who wish to run searches programmatically can explore search instructions and examples in the Search API documentation. Additional details about available API services are provided in the Web Services Overview.
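As a starting point, submitting a query might look like the minimal sketch below, which uses only the Python standard library. The endpoint URL, the request body layout, and the response shape (a "result_set" list of objects with an "identifier" field, and an empty body when nothing matches) are assumptions recalled from the Search API and should be confirmed in its documentation. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint; confirm in the Search API documentation.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def build_request(query_node, return_type="entry"):
    """Wrap a query node in a request body and prepare a POST request."""
    body = json.dumps({"query": query_node, "return_type": return_type}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_identifiers(response_json):
    """Pull identifiers out of a response, assuming a "result_set" list."""
    return [hit["identifier"] for hit in response_json.get("result_set", [])]


def run_search(query_node):
    """Send the query and return matching identifiers (requires network access)."""
    with urllib.request.urlopen(build_request(query_node), timeout=30) as response:
        raw = response.read()
    # An empty body is treated as "no matches" rather than an error.
    return extract_identifiers(json.loads(raw)) if raw else []


# Example query node; the attribute name is an assumption.
example = {
    "type": "terminal",
    "service": "text",
    "parameters": {
        "attribute": "struct.title",
        "operator": "contains_phrase",
        "value": "myoglobin",
    },
}
request = build_request(example)
print(request.get_method(), request.full_url)
```

Calling `run_search(example)` would send the request; it is left out of the snippet so the example runs without network access.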














