Search Modes
PubChemSR provides a graphical user interface (GUI) with largely self-explanatory sections and buttons. It currently supports three search modes: simple text search (in the main window), structure search (in the main window), and batch search (through the Tools menu). The simple text and structure search modes provide the same functionality as NCBI's Entrez search and the PubChem basic structure search, respectively, while the batch search mode extends Batch Entrez by letting users run a list of queries and merge the results into a single file.
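The batch idea (run a list of queries, then merge the results) can be sketched as follows. This is a minimal illustration rather than PubChemSR's actual implementation: it assumes the public NCBI E-utilities ESearch endpoint and the Entrez database name pccompound, and the helper names esearch_url and merge_results are hypothetical. Fetching and parsing the responses is left out so that the sketch runs offline.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="pccompound", retmax=100):
    """Build the ESearch URL for one query term (hypothetical helper)."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


def merge_results(results_by_query):
    """Merge per-query UID lists into one mapping: UID -> queries that hit it.

    Each UID appears once, in first-seen order, so the merged result can be
    written to a single file without duplicates.
    """
    merged = {}
    for query, uids in results_by_query.items():
        for uid in uids:
            merged.setdefault(uid, []).append(query)
    return merged


# One URL per query in the batch.
queries = ["aspirin", "acetaminophen"]
urls = [esearch_url(q) for q in queries]
```

Keeping the merge step separate from the network step means the same function can combine results regardless of how the UID lists were obtained.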
URL Analyzer
The URL Analyzer retrieves search results and displays them in the search result view panel after users have performed a search in their web browser. The full URL of the results page is first copied to the clipboard using 'Copy' or 'Ctrl+C'. The user then enters the URL into the URL Analyzer, either by clicking the Get button or by pasting it into the box. The Anal button checks the URL and retrieves the search results into the preview panel. This feature is particularly useful when a search cannot be completed within the specified time limit (120 seconds by default) or is not supported in PubChemSR. Examples include similarity and substructure searches, and advanced structure searches that support additional filters such as chemical properties or BioActivity.
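The checking step can be illustrated with a small sketch that splits a pasted results URL into its host, path, and query parameters. The source does not describe how PubChemSR analyzes URLs internally, so the function name analyze_url and the example URL below are purely illustrative assumptions.

```python
from urllib.parse import urlparse, parse_qs


def analyze_url(url):
    """Validate a pasted URL and return its host, path, and query parameters."""
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("not a usable web URL: %r" % url)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "path": parts.path, "params": params}


# Illustrative URL only; real result pages may carry different parameters.
example = "https://www.ncbi.nlm.nih.gov/sites/entrez?db=pccompound&term=acetaminophen"
info = analyze_url(example)
```

Rejecting anything that is not an http or https URL up front gives the user a clear error before any retrieval is attempted.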
Bulk Download
Bulk download lets users download information on compounds en masse and export only the desired data fields for each compound. It requires a list of UIDs (unique identifiers: CID for compounds, SID for substances, and AID for BioAssays), which can be obtained through the simple text search or uploaded from a file. The buttons in the 'Retrieve' panel either save the data directly to a text file or first display them in a separate window, which offers further options to export the data to a Microsoft Excel or HTML file.
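For readers who want to script a comparable request, the sketch below builds a single URL that asks for selected properties of a list of CIDs. The source does not say how PubChemSR retrieves the fields; this example swaps in PubChem's PUG REST interface as an assumption, and the helper name bulk_property_url is hypothetical. CIDs 2244 (aspirin) and 1983 (acetaminophen) are used as examples.

```python
# Assumption: PubChem's PUG REST base URL, not necessarily what PubChemSR uses.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def bulk_property_url(cids, fields, fmt="CSV"):
    """Build a PUG REST URL requesting only the given property fields."""
    if not cids or not fields:
        raise ValueError("need at least one CID and one field")
    # int() rejects anything that is not a plain numeric identifier.
    cid_part = ",".join(str(int(cid)) for cid in cids)
    field_part = ",".join(fields)
    return f"{PUG_REST}/compound/cid/{cid_part}/property/{field_part}/{fmt}"


url = bulk_property_url([2244, 1983], ["MolecularFormula", "MolecularWeight"])
```

Very long UID lists would probably need to be split into several requests; that step is omitted here.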
Other Features
The program offers several other features: term correction (misspelled queries can be corrected automatically via the NCBI E-spell web service); selectable data fields (for bulk download, the results can be filtered to include only the fields of interest to the user); preview with picture (the search result view panel summarizes the results ten compounds at a time, with a preview of the structure and selected data fields); and BioAssay retriever (retrieves the actual bioassay activity data and exports them, along with selected compound or substance data fields, to Microsoft Excel or text files).
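Term correction can be sketched as a request to the E-utilities ESpell service followed by reading the suggested spelling from the XML reply. This is an illustration under assumptions, not PubChemSR's code: the helper names are hypothetical, the SAMPLE reply is hand-written in the shape of an ESpell response rather than captured from the server, and which databases ESpell supports has not been checked here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: the public E-utilities ESpell endpoint.
ESPELL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/espell.fcgi"


def espell_url(term, db):
    """Build the ESpell request URL for a query term (hypothetical helper)."""
    return ESPELL + "?" + urlencode({"db": db, "term": term})


def corrected_query(xml_text, original):
    """Return the suggested spelling, or the original term if none is given."""
    root = ET.fromstring(xml_text)
    suggestion = (root.findtext("CorrectedQuery") or "").strip()
    return suggestion or original


# Hand-written sample reply for illustration only.
SAMPLE = """<eSpellResult>
  <Database>pubmed</Database>
  <Query>acetaminophn</Query>
  <CorrectedQuery>acetaminophen</CorrectedQuery>
</eSpellResult>"""

fixed = corrected_query(SAMPLE, "acetaminophn")
```

Falling back to the original term when no correction is offered keeps correctly spelled queries unchanged.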
Examples of Use
PubChemSR can simplify the process of obtaining information from PubChem in many ways. A few examples of how it can be employed for common tasks are listed below.
Comparing chemical properties of related compounds
It is often useful to compare the properties of compounds in a particular structural class. This is straightforward with the refinement and Excel export functions. Figures 1 and 2 show, respectively, a search for 'acetaminophen' using PubChemSR and an Excel spreadsheet created by exporting selected property-related fields from the program. The same kind of comparison can also be made starting from a substructure or similarity search instead of a simple text search.
Browsing bioassays related to kinases, and downloading active compounds in specific assays
Using a text search on the PubChem BioAssay database, one can find all of the assay descriptions that contain particular keywords such as "Kinase". One can then export all of these descriptions to Excel or a text file, or browse them from within the program (as shown in Figure 3). In particular, one can download assay statistics (such as counts of active and inactive structures) and analyze them in Excel (see Figure 4). Upon finding assays of interest, one can retrieve all of the compounds (and related information) flagged as active in a given assay by supplying the assay ID to the BioAssay retriever, as shown in Figure 5. These compounds can then be exported just as with a regular compound search.
Creating a SMILES and activity file for SAR study of an assay
SMILES is a linear text representation of the 2D chemical structure of a compound. A SMILES file usually contains the SMILES string and name of each compound. When a third column containing biological activity values is added, the file becomes a useful input format for a variety of cheminformatics techniques that can automatically determine structure-activity relationships (SAR). Using the BioAssay Retriever, one can download just the SMILES, name, and biological assay results for compounds and then create a simple tab-delimited file that can be loaded into cheminformatics tools.
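The three-column file described above can be produced with a few lines of code once the data have been exported. The sketch below is a minimal example: the function name write_smiles_activity and the record keys are hypothetical, and the activity values are placeholders rather than real assay results.

```python
import csv
import io


def write_smiles_activity(records, handle):
    """Write one tab-delimited line per compound: SMILES, name, activity."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for record in records:
        writer.writerow([record["smiles"], record["name"], record["activity"]])


# Placeholder activity values, for illustration only.
records = [
    {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "name": "aspirin", "activity": 1.5},
    {"smiles": "CC(=O)NC1=CC=C(C=C1)O", "name": "acetaminophen", "activity": 0.2},
]

buffer = io.StringIO()
write_smiles_activity(records, buffer)
text = buffer.getvalue()
```

Passing an open file instead of the in-memory buffer writes the same content to disk; a tab delimiter is used because SMILES strings never contain tabs.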