Tutorial: How to use the Substrate Search Tool
In this tutorial, we shall scan TCDB for TC systems involved in the transport of specific substrates. Substrates in TCDB are annotated based on the ChEBI ontology; thus, any compound name available in ChEBI can be used. In our example, we will search for all TCDB systems involved in the transport of "glucose" or "raffinose".
We shall also learn how to use the Substrate Statistics Tool to count the number of different substrates transported by a given (sub)class, (sub)family, or system. This tool works independently of the Substrate Search Tool, so you can jump directly to that part of the tutorial if you prefer.
Anatomy of the tool:
To identify all the systems in TCDB involved in the transport of specific substrates, we will use the Substrate Search Tool interface shown in Figure 1.
Figure 1. The anatomy of the interface. A. Input box where the name of the query substrate should be entered. B. The Statistics Tool counts the number of times a substrate is annotated within a given TC (sub)class, (sub)family, or system. C. Candidate substrates will be displayed in this box. From the list of available candidates, specific substrates should be selected. D. Selected substrates from box C will be displayed here.
Step 1: identification of query substrates.
Enter the word Glucose in the search box (Figure 1A) and click on the "Search Substrates" button. Substrates matching the query will be shown in the "Search Results for:" box (Figure 1C) as illustrated in Figure 2.
Figure 2. Selecting your query substrates. A. Candidate substrate that can be selected to query the substrate ontology in TCDB.
After you type a substrate of interest in the search box (Figure 1A), glucose in this example, all substrates in the ontology matching that name will be presented in the "Search results for:" area (Figure 1C). The matched substrates are enclosed in boxes that also include the ChEBI identifier (Figure 2A). Now it is time to select the target substrates for the query.
Step 2: selection of substrates that will be used to query TCDB.
After you click on a substrate of your preference (Figure 2A), a popup window will be presented. In this window, and following the ChEBI ontology hierarchy, you can directly select the substrate you clicked on (Figure 3A), a less specific substrate (Parents; Figure 3B), or a more specific substrate (Children; Figure 3C), if available.
Figure 3. Selecting specific substrates. A. Use the selected substrate (Figure 2) directly as part of the query to TCDB. B. Use a less specific substrate (parent compound in the ChEBI ontology) as part of the query. C. Use a more specific substrate (child compound in the ChEBI ontology) as part of the query.
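The three choices offered in the popup (the selected substrate, its parents, or its children) can be pictured with a small sketch. The hierarchy below is a toy fragment written for illustration only; it is not real ChEBI data, and the names PARENTS, parents_of, and children_of are hypothetical helpers, not part of TCDB or ChEBI.

```python
# Toy fragment of a ChEBI-like hierarchy (illustrative assumption, not real data).
# Each term maps to its direct, less specific parent terms.
PARENTS = {
    "D-glucose": ["glucose"],
    "L-glucose": ["glucose"],
    "glucose": ["aldohexose"],
    "aldohexose": ["hexose"],
    "hexose": [],
}


def parents_of(term):
    """Return the direct parents (less specific terms) of a term."""
    return list(PARENTS.get(term, []))


def children_of(term):
    """Return the direct children (more specific terms) of a term."""
    return sorted(child for child, parents in PARENTS.items() if term in parents)


# The three options the popup offers for the clicked substrate.
options = {
    "selected": "glucose",
    "parents": parents_of("glucose"),
    "children": children_of("glucose"),
}
print(options)
```

Storing only the parent links and deriving the children from them keeps the toy hierarchy in one place; a real ontology would normally index both directions.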
Step 3: submit your query to TCDB
After selecting the compound in the previous steps (Figures 2 and 3), the compound will appear on the right side of the screen (Figure 1D). This process can be repeated as many times as necessary to select all the substrates we want to use to query TCDB. In our case, we will select "Glucose" (Figures 1 and 4) and "Raffinose" (Figure 5).
Figure 4. Selecting "Glucose" as one of the query substrates. A. The selected substrate shows in the bar on the right of the screen (see also Figure 1D).
To search for the next substrate, we type "raffinose" in the search box (Figure 1A) and select the box "Raffinose" (Figure 5).
Figure 5. Selecting "Raffinose" as the second substrate. The red dashed box highlights the substrate that should be selected.
The second substrate "Raffinose" will appear below "Glucose" (Figure 6A). Now we have selected the two intended substrates (Figure 6) and can proceed to run our query by clicking on the "Search Substrates" button (Figure 6B).
Figure 6. The page after selecting "Raffinose" as the second substrate. A. The second substrate (Raffinose) is added below the previously selected substrate (Glucose). B. After selecting all the targeted substrates, click on this button to submit the query to TCDB.
Step 4: analyzing the results
After you submit your query (Figure 6B), the results will be displayed in a three-column table listing the substrate name, the TCID of the system transporting the substrate, and the annotation for that system in TCDB (Figure 7).
Figure 7. Three-column table listing the systems in TCDB that transport the queried substrates. A. Substrate name. B. TCID of the system involved in the transport of each listed substrate. C. Current annotation of the system in TCDB. D. Input box to filter results by TCID.
When a query returns many results, they can be filtered by the TCID of a (sub)class, (sub)family, or specific system. For example, in this case we have 133 results, but we can focus only on those within subclass 1.B. Just type "1.B" in the filter box (Figure 7D). As shown in Figure 8, the number of results returned changed from 133 to 7 (red dashed box).
Figure 8. Filtering transported substrates by subclass. The number of systems returned by the filter is enclosed in the red dashed box.
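To make the filtering step concrete, here is a minimal sketch of filtering a three-column result table by TCID. It assumes the filter keeps rows whose TCID starts with the typed value, compared component by component; the actual matching rule used by the page is not documented here. The rows, the annotations, and the helper names tcid_matches and filter_rows are hypothetical placeholders.

```python
# Hypothetical result rows: (substrate, TCID, annotation). Placeholder values only.
ROWS = [
    ("glucose", "1.B.1.1.1", "placeholder annotation"),
    ("glucose", "2.A.1.1.1", "placeholder annotation"),
    ("raffinose", "1.B.3.1.2", "placeholder annotation"),
    ("glucose", "2.A.10.1.1", "placeholder annotation"),
]


def tcid_matches(tcid, prefix):
    """True if the TCID falls under the given (sub)class, (sub)family, or system.

    Assumption: components are compared one by one, so "2.A.1" does not
    accidentally match "2.A.10.1.1".
    """
    prefix = prefix.strip(".")
    if not prefix:
        return True  # an empty filter keeps every row
    want = prefix.split(".")
    have = tcid.split(".")
    return have[:len(want)] == want


def filter_rows(rows, prefix):
    """Keep only the rows whose TCID (second column) matches the prefix."""
    return [row for row in rows if tcid_matches(row[1], prefix)]


subset = filter_rows(ROWS, "1.B")
print(f"{len(ROWS)} results before filtering, {len(subset)} after")
```

Comparing whole components rather than raw string prefixes is a design choice of this sketch: a plain string comparison would treat family 2.A.10 as if it were part of 2.A.1.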
The rows in the table shown in Figure 8 are clickable and they take you directly to the page describing each system. For a detailed explanation of the linked pages, please follow our tutorial: The system component entry page.
Using the Statistics Tool
The Statistics Tool shows the number of systems involved in the transport of different substrates within a (sub)class, (sub)family, or system. This tool works independently of the Substrate Search Tool described above. In this example, we will count the substrates transported by systems under the Sugar Porter (SP) family (TC: 2.A.1.1), which is a member of the major facilitator superfamily (MFS).
To access the tool, click on the button "Go to Statistics Tool" (Figure 1B). The input form shown in Figure 9 will appear.
Figure 9. The Statistics Tool. A. Input box for the TCID that will be used for the analysis. B. Depth of the ChEBI ontology that should be analyzed to calculate the statistics. Higher numbers indicate that less specific substrates will be considered.
The input box for the TCID accepts the percentage symbol (%) as a wildcard, indicating that all systems from that point down the TC system hierarchy will be considered. In our case, we are targeting all systems under 2.A.1.1 (e.g., 2.A.1.1.1, 2.A.1.1.2, 2.A.1.1.3, etc.). Therefore, we need to type "2.A.1.1.%" in the input form (Figure 9A) and click on the button "Get Statistics". The results are shown in Figure 10.
Figure 10. Substrate statistics for TCID "2.A.1.1.%" and depth "1". A. Navigation buttons that update the counts by moving up or down the ChEBI ontology hierarchy. The number of different substrates transported by the systems under the query TCID is shown above these buttons. B. Results are presented in a three-column table displaying the name of the substrate, the ChEBI ID of the substrate, and the number of systems transporting each substrate.
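The effect of the % wildcard in the TCID box can be sketched as follows. This assumes % stands for any run of characters, as it does in an SQL LIKE pattern; that is an assumption about the tool, not something this tutorial states. The function names tcid_pattern and select_systems are hypothetical, and the TCIDs in the list serve only as sample strings.

```python
import re


def tcid_pattern(query):
    """Translate a TCID query with % wildcards into a compiled regular expression.

    Assumption: % matches any run of characters (SQL LIKE style).
    """
    parts = [re.escape(part) for part in query.split("%")]
    return re.compile(".*".join(parts))


def select_systems(tcids, query):
    """Return the TCIDs that match the whole query pattern."""
    pattern = tcid_pattern(query)
    return [tcid for tcid in tcids if pattern.fullmatch(tcid)]


# Sample TCID strings used only to exercise the pattern.
TCIDS = ["2.A.1.1.1", "2.A.1.1.25", "2.A.1.2.1", "2.A.1.10.1"]

print(select_systems(TCIDS, "2.A.1.1.%"))  # systems under family 2.A.1.1 only
print(select_systems(TCIDS, "2.A.1.1%"))   # without the final dot, 2.A.1.10.1 also matches
```

Under this assumption, the dot before the % matters: "2.A.1.1.%" restricts the match to family 2.A.1.1, whereas "2.A.1.1%" would also pick up families such as 2.A.1.10.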
With the two navigation buttons (Figure 10A), we can adjust the substrate counts by incorporating less specific substrates. If we move up the ChEBI hierarchy, the counts will include each substrate's parent compounds. Thus, the counts will increase as we move up to higher (less specific) levels in the ChEBI ontology. Let us update the counts by going up 2 (Figure 11) and 3 (Figure 12) levels in the ChEBI hierarchy.
Figure 11. TCID "2.A.1.1.%" and search ontology depth "2". A. Counts for the top two substrates.
Figure 12. TCID "2.A.1.1.%" and search ontology depth "3". A. Counts for the top four substrates.
Compare the numbers when the statistics are calculated at depths 1 to 3 (Figures 10B, 11A and 12A). At depth 1 (Figure 10B), the most frequent substrate is glucose, with 49 occurrences. At depth 2 (Figure 11A), the first parent compounds are included, and the less specific aldohexose becomes more frequent (65 occurrences) than glucose. At depth 3 (Figure 12A), the most frequent substrate is now the even less specific compound hexose. However, this time the count for glucose changed from 49 (Figure 10B) to 57. This is because there are several ontology paths that lead to glucose (6 in the chemical entity hierarchy). For further details, read the documentation of the ChEBI ontology. In general, the higher the level we select to calculate the statistics, the more ontology paths to the same compound will be considered in the counts. Therefore, make sure you select the depth level of your search carefully.
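The way counts shift with depth can be reproduced on a toy example. The sketch below assumes that depth 1 counts only the annotated substrate, that each extra level adds the parents one step further up, and that a system is counted at most once per term; the real tool may count paths differently. The hierarchy, the system labels, and the function names are all hypothetical, and the numbers are not the ones shown in the figures.

```python
from collections import Counter

# Toy hierarchy (illustrative assumption, not real ChEBI data): term -> direct parents.
PARENTS = {
    "D-glucopyranose": ["D-glucose"],
    "D-glucose": ["glucose"],
    "glucose": ["aldohexose"],
    "galactose": ["aldohexose"],
    "aldohexose": ["hexose"],
    "fructose": ["ketohexose"],
    "ketohexose": ["hexose"],
    "hexose": [],
}

# Hypothetical systems and their annotated substrates.
ANNOTATIONS = {
    "sys1": ["glucose"],
    "sys2": ["glucose"],
    "sys3": ["galactose"],
    "sys4": ["D-glucopyranose"],
    "sys5": ["fructose"],
}


def terms_within_depth(substrate, depth):
    """Return the substrate plus its ancestors up to depth - 1 levels above it."""
    seen = {substrate}
    frontier = {substrate}
    for _ in range(depth - 1):
        frontier = {p for term in frontier for p in PARENTS.get(term, [])} - seen
        seen |= frontier
    return seen


def substrate_counts(annotations, depth):
    """Count, for each term, the systems that reach it within the given depth."""
    counts = Counter()
    for substrates in annotations.values():
        terms = set()
        for substrate in substrates:
            terms |= terms_within_depth(substrate, depth)
        counts.update(terms)  # each system contributes at most once per term
    return counts


for depth in (1, 2, 3):
    counts = substrate_counts(ANNOTATIONS, depth)
    print(depth, counts["glucose"], counts["aldohexose"], counts["hexose"])
```

In this toy data the count for glucose rises only at depth 3, when a system annotated with a more specific compound two levels below glucose is finally included. This mirrors the kind of change described above, although the mechanism in the real tool may differ.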