ERDDAP ERDDAPY: Difference between revisions
Revision as of 14:15, 19 January 2022
How to use erddapy
First of all, we need to instantiate the ERDDAP URL constructor for a server.
- server
- an ERDDAP server URL or an acronym for one of the builtin servers.
    from erddapy import ERDDAP
    import pandas as pd

    e = ERDDAP(server="https://erddap.bcdc.no/erddap")
To explore the methods and attributes available in the ERDDAP object:
[method for method in dir(e) if not method.startswith("_")]
['auth', 'constraints', 'dataset_id', 'get_categorize_url', 'get_download_url', 'get_info_url', 'get_search_url', 'get_var_by_attr', 'protocol', 'relative_constraints', 'requests_kwargs', 'response', 'server', 'server_functions', 'to_iris', 'to_ncCF', 'to_pandas', 'to_xarray', 'variables']
Note: All the methods prefixed with get_ will return a valid ERDDAP URL for the requested response and options.
To get help on a method:
help(e.get_search_url)
- Help on method get_search_url in module erddapy.erddapy:
- get_search_url(response: Union[str, NoneType] = None, search_for: Union[str, NoneType] = None, protocol: Union[str, NoneType] = None, items_per_page: int = 1000, page: int = 1, **kwargs) -> str method of erddapy.erddapy.ERDDAP instance
- The search URL for the `server` endpoint provided.
- Args:
- search_for: "Google-like" search of the datasets' metadata.
- - Type the words you want to search for, with spaces between the words.
- ERDDAP will search for the words separately, not as a phrase.
- - To search for a phrase, put double quotes around the phrase (for example, `"wind speed"`).
- - To exclude datasets with a specific word, use `-excludedWord`.
- - To exclude datasets with a specific phrase, use `-"excluded phrase"`
- - Searches are not case-sensitive.
- - You can search for any part of a word. For example, searching for `spee` will find datasets with `speed` and datasets with `WindSpeed`
- - The last word in a phrase may be a partial word. For example, to find datasets from a specific website (usually the start of the datasetID), include (for example) `"datasetID=erd"` in your search.
- response: default is HTML.
- items_per_page: how many items per page in the return, default is 1000.
- page: which page to display, default is the first page (1).
- kwargs: extra search constraints based on metadata and/or coordinates ke/value.
- metadata: `cdm_data_type`, `institution`, `ioos_category`, `keywords`, `long_name`, `standard_name`, and `variableName`.
- coordinates: `minLon`, `maxLon`, `minLat`, `maxLat`, `minTime`, and `maxTime`.
- Returns:
- url: the search URL.
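To make the role of `search_for`, `page`, and `items_per_page` more concrete, here is a minimal sketch that builds a plain full-text search URL with the standard library only. It assumes the usual ERDDAP REST layout `/search/index.{response}?page=...&itemsPerPage=...&searchFor=...`; the helper name `simple_search_url` is hypothetical and not part of erddapy, and the URL that `get_search_url` actually returns may differ (for example, it may target the advanced search endpoint).

```python
from urllib.parse import urlencode


def simple_search_url(server, search_for, response="csv", page=1, items_per_page=1000):
    """Build a plain ERDDAP full-text search URL.

    Hypothetical helper for illustration; the endpoint layout is an assumption
    and this is not how erddapy itself is implemented.
    """
    # urlencode percent-encodes the quotes and turns spaces into '+'
    query = urlencode(
        {"page": page, "itemsPerPage": items_per_page, "searchFor": search_for}
    )
    return f"{server.rstrip('/')}/search/index.{response}?{query}"


# A phrase search combined with an excluded word
url = simple_search_url("https://erddap.bcdc.no/erddap", '"wind speed" -model')
print(url)
```

In practice `e.get_search_url(...)` does this work for you; the sketch only shows that the search options end up as ordinary query parameters.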
Then ERDDAP's users can:
- access the list of all datasets available through this ERDDAP server
- access the list of datasets by type (grid, tabular, ...)
Access the list of all datasets available through this ERDDAP server
Here we use the get_search_url method
    # show all datasets
    url = e.get_search_url()
    print(url)
We also specify the response attribute in our ERDDAP instance.
- response
- specifies the type of table data file that you want to download (default html). Many response types are available; see the docs for griddap and tabledap respectively.
    # show all datasets
    e.response='csv'
    url = e.get_search_url(search_for="all")
    df = pd.read_csv(url)
    df[['griddap','tabledap','Dataset ID']].head()
| | griddap | tabledap | Dataset ID |
|---|---|---|---|
| 0 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/allD... | allDatasets |
| 1 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170409SocatEnhanced |
| 2 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170421SocatEnhanced |
| 3 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170430SocatEnhanced |
| 4 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170511SocatEnhanced |
Access the list of datasets by type (grid, tabular, ...)
Here we use the get_search_url method. We also specify the response and protocol attributes in our ERDDAP instance.
- response
- specifies the type of table data file that you want to download (default html).
- protocol
- choose between tabledap or griddap.
    # show datasets by type
    e.response='csv'
    e.protocol='tabledap'
    url = e.get_search_url()
    df = pd.read_csv(url)
    df[['griddap','tabledap','Dataset ID']].head()
| | griddap | tabledap | Dataset ID |
|---|---|---|---|
| 0 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/allD... | allDatasets |
| 1 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170409SocatEnhanced |
| 2 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170421SocatEnhanced |
| 3 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170430SocatEnhanced |
| 4 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170511SocatEnhanced |
As a user, though, you probably don't want to use all the datasets, and you surely don't want to look through every one of them to find which ones hold the data you are interested in.
How to search datasets
ERDDAP users can select datasets by:
- Full text search (Google-like search of the datasets' metadata)
- Category search
- Advanced search
How to use Full text search
Here we use the get_search_url method with its search_for argument. We also specify the response and protocol attributes in our ERDDAP instance.
- response
- specifies the type of table data file that you want to download (default html).
- protocol
- choose between tabledap or griddap.
- search_for
- “Google-like” search of the datasets’ metadata.
- Type the words you want to search for, with spaces between the words. ERDDAP will search for the words separately, not as a phrase.
- To search for a phrase, put double quotes around the phrase (for example, "wind speed").
- To exclude datasets with a specific word, use -excludedWord .
- To exclude datasets with a specific phrase, use -"excluded phrase" .
- Don't use AND between search terms. It is implied. The results will include only the datasets that have all of the specified words and phrases (and none of the excluded words and phrases) in the dataset's metadata (data about the dataset).
- Searches are not case-sensitive.
- To search for specific attribute values, use attName=attValue .
- To find just grid or just table datasets, include protocol=griddap or protocol=tabledap in your search.
- This ERDDAP is using searchEngine=original.
- In this ERDDAP, you can search for any part of a word. For example, searching for spee will find datasets with speed and datasets with WindSpeed.
- In this ERDDAP, the last word in a phrase may be a partial word. For example, to find datasets from a specific website (usually the start of the datasetID), include (for example) "datasetID=erd" in your search.
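The search syntax listed above (plain words, quoted phrases, `-excludedWord`, `-"excluded phrase"`) can be assembled programmatically. The following is a small sketch; the helper `build_search_for` and its parameter names are made up for illustration and are not part of erddapy or ERDDAP.

```python
def build_search_for(words=(), phrases=(), excluded_words=(), excluded_phrases=()):
    """Compose an ERDDAP full-text search string (hypothetical helper)."""
    parts = list(words)                                # plain words, searched separately
    parts += [f'"{p}"' for p in phrases]               # phrases go in double quotes
    parts += [f"-{w}" for w in excluded_words]         # -excludedWord
    parts += [f'-"{p}"' for p in excluded_phrases]     # -"excluded phrase"
    # Terms are joined with spaces; AND is implied, so it is never inserted
    return " ".join(parts)


query = build_search_for(
    words=["fCO2"],
    phrases=["wind speed"],
    excluded_words=["model"],
)
print(query)  # fCO2 "wind speed" -model
```

The resulting string can then be passed as `e.get_search_url(search_for=query)`.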
    # show datasets selected by full text search
    e.response='csv'
    e.protocol='tabledap'
    url = e.get_search_url(search_for='fCO2')
    df = pd.read_csv(url)
    print(f'{len(set(df["tabledap"].dropna()))} matching tabledap datasets')
    df[['griddap','tabledap','Dataset ID']].head()
119 matching tabledap datasets
| | griddap | tabledap | Dataset ID |
|---|---|---|---|
| 0 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/allD... | allDatasets |
| 1 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170409SocatEnhanced |
| 2 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170421SocatEnhanced |
| 3 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170430SocatEnhanced |
| 4 | NaN | https://erddap.icos-cp.eu/erddap/tabledap/icos... | icos26na20170511SocatEnhanced |
How to get info on metadata
erddapy comes with a method named get_info_url for exploring a dataset's metadata.
    # get metadata information
    e.response='csv'
    e.dataset_id=df['Dataset ID'].values[1]
    info_url = e.get_info_url()
    info = pd.read_csv(info_url)
    info.head(6)
| | Row Type | Variable Name | Attribute Name | Data Type | Value |
|---|---|---|---|---|---|
| 0 | attribute | NC_GLOBAL | acquisition_ended_at_time | String | 2017-04-16T14:21:09Z |
| 1 | attribute | NC_GLOBAL | acquisition_started_at_time | String | 2017-04-10T14:01:01Z |
| 2 | attribute | NC_GLOBAL | acquisition_station_class | String | 1 |
| 3 | attribute | NC_GLOBAL | acquisition_station_comment | String | The research vessel (R/V) G.O. Sars is own and... |
| 4 | attribute | NC_GLOBAL | acquisition_station_country_code | String | NO |
| 5 | attribute | NC_GLOBAL | acquisition_station_id | String | 58G2 |
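Once the info table is loaded, a common next step is to separate the global attributes from the list of variables. The sketch below does this with the standard library on a small made-up CSV that follows the same column layout as the table above; the sample rows are illustrative only (not real server output), and it assumes the info table marks variables with a `variable` value in the `Row Type` column.

```python
import csv
import io

# Made-up sample in the same column layout as the info table (not real server output)
sample_info_csv = """Row Type,Variable Name,Attribute Name,Data Type,Value
attribute,NC_GLOBAL,title,String,Example cruise
variable,time,,double,
attribute,time,units,String,seconds since 1970-01-01T00:00:00Z
variable,fCO2,,float,
attribute,fCO2,units,String,uatm
"""

rows = list(csv.DictReader(io.StringIO(sample_info_csv)))

# Rows with Row Type "variable" name the dataset's variables
variable_names = [r["Variable Name"] for r in rows if r["Row Type"] == "variable"]

# Attributes attached to NC_GLOBAL describe the dataset as a whole
global_attrs = {
    r["Attribute Name"]: r["Value"]
    for r in rows
    if r["Row Type"] == "attribute" and r["Variable Name"] == "NC_GLOBAL"
}

print(variable_names)
print(global_attrs)
```

With the pandas DataFrame from the example above, the same selection would be a boolean filter on the `Row Type` and `Variable Name` columns.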