DatCat, developed and run by CAIDA, is an Internet Measurement Data Catalog (IMDC), a searchable registry of information about network measurement datasets. It serves the global network research community by allowing anyone to find, annotate, and cite data contributed by others, and allowing anyone to contribute new data collections.
The goals of DatCat IMDC are:
Finding data to use in network research has historically been difficult. By serving as a shared global resource where anyone can find the data needed for network analysis, DatCat mitigates a significant barrier to research.
Instead of relying on the data contributor alone to document the data, DatCat allows any researcher to annotate datasets with problems, features, or missing information they discover in the data, thereby increasing the utility of the datasets.
Reproducibility of results is a cornerstone of good science, but it requires that the researcher's data be available to others. Similarly, to get the most meaningful comparison of analysis methodologies and algorithms, researchers must test them against the same data. By putting their data in DatCat or using data already in DatCat, and then citing the IMDC Handle in their published results, researchers make it easier for others to obtain their data and either validate their results or perform alternate analyses on the same data.
Note that IMDC stores only metadata, that is, descriptions of the data and instructions for obtaining it; it does not store the data (or tools) itself. The data remains in the hands of the contributor. As such, it may or may not be freely available; it might, for example, reside on a password-protected server, or require asking the owner of the data. IMDC does not dictate the terms of availability of the data; it only helps you with the first step of finding the data.
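As an illustration of the metadata-only design, the following Python sketch models a catalog entry that holds a description and access instructions but no data payload. The class name, field names, and sample values are hypothetical and are not taken from the actual IMDC schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical metadata record; the data itself is stored elsewhere."""
    name: str
    description: str
    access_instructions: str                 # how to obtain the data from its contributor
    freely_available: Optional[bool] = None  # None means the terms are not stated


def access_summary(entry: CatalogEntry) -> str:
    """Summarize how a reader would go about obtaining the data."""
    if entry.freely_available is True:
        terms = "freely available"
    elif entry.freely_available is False:
        terms = "restricted"
    else:
        terms = "availability not stated"
    return f"{entry.name}: {terms}; {entry.access_instructions}"


# Sample values are invented for illustration only.
entry = CatalogEntry(
    name="example-traces",
    description="Hypothetical packet traces",
    access_instructions="ask the data owner for access",
    freely_available=False,
)
print(access_summary(entry))
```

The record deliberately has no field for the data: the catalog only points to where and how the data can be obtained.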
DatCat development was supported by grant ANI-0137121 from the Advanced Networking Infrastructure program of the National Science Foundation.
See also the IMDC white paper.
Information in IMDC is organized as Objects, each of which describes a real-world object or idea. For the purposes of finding and obtaining data, the most important types of objects are:
The following typographic and styling conventions are used throughout DatCat. Their exact appearance will depend on your web browser.
| Example | Meaning | Links to |
|---|---|---|
| imdc | The name of an object in the catalog | The object's detail page |
| wireless data collections | A description of a set of objects | Search results matching the description |
| Object Types | Help topic | The page for the help topic |
| info@datcat. | Email address | A |
| advanced search | Other internal reference | An internal page |
| CAIDA | An external reference | An external site |
Object names within IMDC are not necessarily unique. Some objects (e.g., Locations) do not even have names. To uniquely identify objects, IMDC assigns each one a persistent Handle. IMDC Handles are designed to last forever, making them ideal for use in citing data in a research paper.
Adding the catalog's URL prefix to the beginning of a handle will produce the URL of the object's detail page.
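The rule that a URL prefix plus a handle yields a detail-page URL can be sketched in Python as follows. This is a minimal sketch: the function name is hypothetical, and both the prefix and the handle below are placeholders rather than real DatCat values. It also assumes the prefix already ends with whatever separator the catalog expects.

```python
def handle_to_url(handle: str, url_prefix: str) -> str:
    """Form a detail-page URL by adding url_prefix to the beginning of a handle."""
    if not handle:
        raise ValueError("handle must be non-empty")
    if not url_prefix:
        raise ValueError("url_prefix must be non-empty")
    # Assumption: plain concatenation, with the separator included in the prefix.
    return url_prefix + handle


# Placeholder values for illustration only.
print(handle_to_url("EXAMPLE-HANDLE", "https://catalog.example/"))
```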
Informally, the syntax of an IMDC Handle is:
The instanceName of this catalog is imdc.datcat.org. In the future, there may be other instances of IMDC systems with different names. Within an IMDC instance, the instanceName is often omitted, in which case the current instance is implied.
The version number is currently 1. If changes to the handle syntax are made in the future, handles generated after that point may contain a new version number, but all handles generated previously will remain valid.
Any 0 and 1 characters in a handle are the digits zero and one, not letters.
Any logged-in user can annotate any IMDC object with a "note" containing additional information.
Although creating an account is not a prerequisite for browsing DatCat, we strongly encourage it. Creating an account is easy and free, and allows you to annotate catalog entries and customize your interaction with DatCat.
You must have an account to contribute information to IMDC.