Information in IMDC is organized as Objects, each of which
describes a real-world object.
Common Fields
The following fields appear on multiple types of IMDC objects.
- handle
- An IMDC Handle that uniquely identifies
the object.
- name
- The object's full name, which will be displayed to other users.
If the name is not unique, it may be displayed with a numeric suffix
to make it unique.
- contributor
- The contact who contributed the
metadata for this object to IMDC.
Usually an object's contributor is the only user allowed to edit it.
- creators
(on Format,
Collection,
Data,
Package,
Location)
- A list of contacts who created the
real-world item described by this IMDC object
- primary contact
(on Format,
Collection,
Data,
Package,
Location)
- Who to contact with questions about the real-world item described by
this IMDC object, if different from the creators.
- creation time
- The time at which this object was contributed to IMDC
- modification time
- The time of the most recent modification to this IMDC object
(i.e., the metadata, not the real-world item)
- short description
(on Contact,
Format,
Collection,
Data,
Package,
Annkey)
- A short (up to 128 characters) description of the object, displayed
in tables containing the object and at the top of the object's detail page
- (long) description
(on Contact,
Format,
Collection,
Data,
Package,
Annkey)
- A much longer description of the object, displayed only on the
object's detail page
- description URL
(on Contact,
Format,
Collection,
Data,
Package,
Annkey)
- The URL of a web page that describes the object.
- keywords
(on Format,
Collection,
Data,
Package)
- A list of words or short phrases that describe important features of
this object, useful for searching.
- private ID
- A string useful for referring to another database's representation
of the same real-world item described by this IMDC object.
See Private IDs in the
Contributing documentation.
- state
- "active" objects are visible to users other than the contributor;
"staged" objects are not, and can be deleted.
See the Contributing documentation.
- citation
- A BibTeX entry that can be used to cite the object.
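As a rough illustration of how a few of these fields interact, the following Python sketch models the visibility rule for the state field and the 128-character limit on short descriptions. The class name CatalogObject and its method names are hypothetical and are not part of IMDC. The sketch also treats the contributor as the only editor, which the text above says is only usually the case.

```python
from dataclasses import dataclass

MAX_SHORT_DESCRIPTION = 128  # limit stated for the short description field


@dataclass
class CatalogObject:
    """Hypothetical container for a few of the common fields."""
    handle: str
    name: str
    contributor: str            # identifies the contributing contact
    state: str                  # "active" or "staged"
    short_description: str = ""

    def __post_init__(self):
        if self.state not in ("active", "staged"):
            raise ValueError(f"unknown state: {self.state!r}")
        if len(self.short_description) > MAX_SHORT_DESCRIPTION:
            raise ValueError("short description exceeds 128 characters")

    def visible_to(self, user: str) -> bool:
        # Active objects are visible to everyone; staged objects are
        # visible only to their contributor.
        return self.state == "active" or user == self.contributor

    def editable_by(self, user: str) -> bool:
        # Simplification: only the contributor may edit.
        return user == self.contributor
```

With this sketch, a staged object contributed by "alice" is visible to "alice" but not to any other user until its state becomes "active".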
Data
The core of IMDC is the Data object.
A Data object describes a dataset contained in a single file in its most
natural working form, even if the data is not made available directly
in that form. A Data object must belong to at least one Package object.
The name of the IMDC Data object is typically, but not necessarily,
the same as the filename used on real-world copies of the data file.
However, because copies of the file can have different filenames, the
filename is specified within the Package object, and a single Data object
can describe all the copies of the same data.
Fields:
- File size
- The size of the (uncompressed) data file, in bytes
- Format
- The Format of the datafile
- Geographic Location
- The geographic source of the data collection,
in terms of continent, country, state, province, city, etc.
- Network Location
- Where on the network the data was collected, in terms of hostname,
IP address, AS, etc.
- Logistic Location
- The source of the data from an organizational viewpoint, e.g.
"X-root DNS server" or "University of Freedonia off-campus link"
- Platform
- The hardware, software, and OS used to collect the data
- Group tags
- A list of names of groups of closely related objects to
which this object belongs. See also
Collections.
- Time Zone
- The time zone in which the data was collected
- Start Time and End Time
- The time at which data collection started or ended
- Duration
- The time period covered by the data
- Creation Process
- A text description of the procedure used to collect the data
- MD5 hash
- The MD5 fingerprint of the datafile, displayed as 32 hexadecimal digits.
- handle,
name,
contributor,
creators,
primary contact,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
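The File size and MD5 hash fields can be computed from a local copy of the data file before contributing it. The following is a minimal Python sketch using the standard hashlib and os modules. The function name describe_file and the returned dictionary keys are hypothetical, not an IMDC interface.

```python
import hashlib
import os


def describe_file(path, chunk_size=1 << 20):
    """Return the size in bytes and the MD5 hex digest of a file."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large trace files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return {
        "file_size": os.path.getsize(path),  # size of the uncompressed file
        "md5": md5.hexdigest(),              # 32 hexadecimal digits
    }
```

The file should be uncompressed first if it is stored compressed, since the File size field refers to the uncompressed data file.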
Collection
A Collection is a set of closely related Data objects, often collected
as part of a single effort.
A Collection may contain other Collections or Publications to indicate that
it contains all the data contained by the others.
For example, Collections named "F-root DNS traces" and "A-root DNS traces"
might each contain hundreds of Data objects representing traces taken at
the respective root DNS servers, while "Root DNS traces" might contain both of
those Collections and thus indirectly contain all Data contained by those
Collections.
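The indirect containment described above amounts to a recursive walk over the contents of nested Collections. The sketch below assumes a hypothetical in-memory mapping from each Collection (or Publication) name to the names of its direct members; it does not reflect how IMDC stores this information.

```python
def all_data(name, contents, _seen=None):
    """Return the names of all Data contained directly or indirectly.

    `contents` maps a Collection or Publication name to a list of its
    direct members. Any member that is not itself a key of `contents`
    is treated as a Data object.
    """
    seen = set() if _seen is None else _seen
    if name in seen:          # guard against accidental cycles
        return set()
    seen.add(name)
    result = set()
    for member in contents.get(name, []):
        if member in contents:
            result |= all_data(member, contents, seen)
        else:
            result.add(member)
    return result


contents = {
    "F-root DNS traces": ["f-trace-1", "f-trace-2"],
    "A-root DNS traces": ["a-trace-1"],
    "Root DNS traces": ["F-root DNS traces", "A-root DNS traces"],
}
root_data = all_data("Root DNS traces", contents)
```

Here root_data holds the three trace names even though "Root DNS traces" lists no Data objects directly.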
Fields:
- Contents
- A set of
Data,
Collections, and
Publications
belonging to this
Collection.
- Summary
- A medium-length description of this Collection
and its purpose.
- Motivation
- The motivation behind the creation of this collection,
i.e. why the collection's creators thought it would be useful for the
contents of this collection to be gathered together.
- Start Time, End Time, Duration
- These fields roughly describe the time period covered by the data
objects in the collection, although there may be gaps in the coverage.
- handle,
name,
contributor,
creators,
primary contact,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
Publication
A Publication describes a scholarly paper, article, or other publication
that uses Internet measurement data.
IMDC is not meant to be a catalog of publications, as
that service is already performed by other sites such as
Google Scholar and
CiteSeer.
Rather, the primary purpose of indexing publications in IMDC
is to index the data used by the publications.
A Publication object in IMDC can be thought of as a
specialized kind of Collection
whose Data contents represent the data used by the publication.
Like a Collection, a Publication may contain other Collection or Publication
objects in order to incorporate their contents.
Fields:
- Title
- The publication's full title.
- Authors
- A list of contacts who wrote the
publication
- Data Used
- A list of Data used by this
Publication.
- Venue
- The name of the conference, journal, magazine, or other venue where
the publication was published.
- Summary
- A medium-length description of this Publication.
- Abstract
- A much longer description of the object, displayed only on the
object's detail page
- URL
- A URL where the full publication can be found.
- Publication Date
- The year and optionally month and day of publication
- Start Time, End Time, Duration
- These fields roughly describe the time period covered by the data
objects used by the publication, although there may be gaps in the coverage.
- handle,
contributor,
primary contact,
creation time,
modification time,
keywords,
private ID
- See Common Fields above.
Package
A Package object describes a set of one or more data files,
in a form that can be downloaded or otherwise made available.
Package objects usually represent compressed archives of data files,
but can be as simple as a single uncompressed data file, if
that file is the downloadable form. Each member of a package's
contents has a path (full filename) associated with it,
which describes exactly how the file is embedded within the package.
For a "not packaged" package,
where the package and data describe the same file, the path is simply
the filename of the data file. But for an archive package
(e.g., one in tar format),
the content's path is the path to the data file within the
package archive file.
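For an archive package, the path of each content item is what an archive tool reports as the member name. The following Python sketch uses the standard tarfile module to list those paths. The helper names and the example path traces/2005/sample.pcap are made up for illustration.

```python
import io
import tarfile


def member_paths(archive_bytes):
    """Return the paths of the regular files inside a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


def build_example_archive():
    """Build a small gzip-compressed tar archive in memory."""
    payload = b"example"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="traces/2005/sample.pcap")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


paths = member_paths(build_example_archive())
```

Each string in paths is the kind of value that would be recorded as the path of the corresponding Data object within the Package.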
Fields:
- Package size
- The size of the package file, in bytes
- Format
- The Format of the package file
- Contents
- A list of Data and Package objects for the datafiles contained
in this package and their paths within the package file.
- MD5 hash
- The MD5 fingerprint of the package file
- handle,
name,
contributor,
creators,
primary contact,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
Location
A Location object represents a method for obtaining a single Package.
Packages available via multiple means (e.g., mirrors) will have
a distinct Location object for each.
Often a Location will include a URL linked directly to the
package file (external to IMDC),
although it may be password-protected or otherwise restricted.
If the Location is restricted in any way, such as requiring owner approval
or agreement to an AUP, there will be instructions on how to obtain the
package.
Fields:
- Package
- The Package at this Location
- Download URL
- A URL linked directly to a downloadable copy of the package file.
If there is no such direct link, this field should be blank,
and Download Procedure must not be blank.
- Download Procedure
- Text instructions on how to obtain the package. May be blank only
if Download URL contains a URL with unrestricted access.
- Geographic Location
- The geographic location of the server described in this Location,
to help users choose between multiple Locations of the same Package.
- Logistic Location
- The location of the package from an organizational viewpoint, e.g.
"CAIDA" or "University of Freedonia".
- Availability
- "Free" if anyone may obtain the package without
restrictions, or "Restricted" if obtaining the package
requires a password or agreement to an AUP or has other restrictions.
- Status
- The status of this Location: "Active" or "Disabled".
- handle,
name,
contributor,
creators,
primary contact,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
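The relationship between Download URL, Download Procedure, and Availability can be expressed as a small consistency check. The sketch below is one possible reading of the rules above, not IMDC's actual validation: it assumes that "a URL with unrestricted access" means a non-blank Download URL combined with an Availability of "Free". The function name is hypothetical.

```python
def location_problems(download_url, download_procedure, availability):
    """Return a list of rule violations for a Location (empty if none)."""
    problems = []
    has_url = bool(download_url.strip())
    has_procedure = bool(download_procedure.strip())

    if availability not in ("Free", "Restricted"):
        problems.append("Availability must be Free or Restricted")

    if not has_url and not has_procedure:
        # With no direct link, users need instructions instead.
        problems.append(
            "Download Procedure is required when Download URL is blank")
    elif not has_procedure and availability != "Free":
        # A direct link with restricted access still needs instructions.
        problems.append(
            "Download Procedure is required when access is restricted")
    return problems
```

A Location with a direct link and "Free" availability passes with a blank procedure, while a password-protected link needs accompanying instructions.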
Contact
Contact objects describe a person or role. Every IMDC login account is
a Contact. Contacts are also used to describe the creators of data, packages,
and other cataloged items, even if those creators do not have IMDC logins.
For contacts with logins, the contact's contributor is the contact itself.
Fields used only on login accounts:
- login
- A unique string of 3-80 non-space characters used to log in to IMDC.
- password
- A secret string of 4-40 characters.
- private email
- An address at which IMDC can send email to the user. This address
will never be given out to anyone else.
- time zone
- The user's preferred time zone. This is used for displaying and
optionally inputting dates and times.
- display rows
- The number of search results to display on a single page.
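The length limits on login and password are easy to check before submitting an account form. The following Python sketch assumes that "non-space" excludes every whitespace character (as matched by the regular expression class \S), which the text above does not spell out. The function names are hypothetical, and the uniqueness of the login is not checked here.

```python
import re

# 3 to 80 characters, none of which is whitespace (assumed interpretation).
LOGIN_RE = re.compile(r"\S{3,80}")


def valid_login(login):
    """Check the length and character rule for a login string."""
    return LOGIN_RE.fullmatch(login) is not None


def valid_password(password):
    """Check the 4-40 character length rule for a password."""
    return 4 <= len(password) <= 40
```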
Fields used on all contacts:
- Type
- "person" if this contact describes an individual person
(e.g. "Ken Keys"),
or "role" if this contact describes a job or function
independently of the person or people who perform it
(e.g. "CAIDA Data Manager").
New accounts are always created with type "person",
but can be edited later to change their type to "role".
- (public) email
- An email address that will be given out to other users.
- phone
- The contact's phone number.
You may choose to hide this from other users.
- address
- The contact's postal address.
You may choose to hide this from other users.
- country
- The contact's country.
You may choose to hide this from other users.
- organization
- The organization to which the contact belongs.
- handle,
name,
contributor,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
Data Formats and Package Formats
The two Format types describe file formats used for
data files (e.g., pcap)
and package files (e.g., tar-gzip).
There may also be
Annotation Keys associated with a
format, which have the effect of adding new fields to objects that have that
format.
Fields:
- Type
- Describes whether the file format is binary, text, or mixed.
- Suffixes
- A list of suffixes commonly used on the names of files of this format
(e.g., .pcap or .tar.gz).
- Can contain multiple files
(Package formats only)
- Describes whether packages with this format can contain multiple
files or only one.
- handle,
name,
contributor,
creators,
primary contact,
creation time,
modification time,
short description,
description,
description URL,
keywords,
private ID
- See Common Fields above.
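One use of the Suffixes field is to guess a likely Format from a filename. The sketch below picks the format with the longest matching suffix, so that .tar.gz wins over a plain .gz. The function name, the suffix table, and the "gzip" entry are illustrative assumptions, not data taken from IMDC.

```python
def match_format(filename, formats):
    """Return the name of the format whose suffix best matches filename.

    `formats` maps a format name to its list of suffixes. The longest
    matching suffix wins. Returns None if no suffix matches.
    """
    best_name, best_len = None, 0
    for name, suffixes in formats.items():
        for suffix in suffixes:
            if filename.endswith(suffix) and len(suffix) > best_len:
                best_name, best_len = name, len(suffix)
    return best_name


# Illustrative table only.
FORMATS = {
    "pcap": [".pcap"],
    "tar-gzip": [".tar.gz", ".tgz"],
    "gzip": [".gz"],
}
```

A suffix match is only a hint, because suffixes are described above as commonly used rather than required.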
Annotations
An annotation is an extra piece of information attached to an IMDC object,
consisting of an
Annotation Key (described below)
and a value. The key defines the range of possible values, how to
interpret them, and who has permission to create and edit annotations.
Some annotations act like extensions to an object's built-in fields;
the owner of an object can usually take control over these annotations even
if they were contributed by someone else, in order to maintain the integrity
of the object.
Others, such as those with key "note",
can be created by anyone who wants to attach a note to an object,
and cannot be edited by the object's owner.
Fields:
- Object
- The object to which this Annotation refers.
- Fragment
- Not yet implemented. In the future, this will allow annotations
to refer to a particular
fragment
of an object,
rather than the entire object.
- Key
- The Annotation Key of this Annotation.
- Value
- The value of this annotation, with type according to the
Annotation Key.
- handle,
contributor,
creation time,
modification time,
private ID
- See Common Fields above.
Annotation Keys (Annkeys)
An annotation's key describes the
range of possible values, how to interpret the values, and who has permission
to create and edit annotations. The most important part of a Key is its name,
but the full specification of a Key also includes
a definer who defined the key,
the object type to which the key can be applied,
and the format of objects to which the key can be applied.
The definer, object type, and format together comprise a namespace; that is,
the combination of definer, object type, format, and name must be unique.
If a Key's definer is its Format, the Key
is an extension of the definition of that Format. Otherwise,
the key's definer is the contact who contributed the key.
There is a set of predefined Keys defined by a contact named imdc,
and users can define new Keys.
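The uniqueness rule can be pictured as a lookup table keyed by the four-part combination of definer, object type, format, and name. The following Python sketch uses a hypothetical KeyRegistry class to show that the same name may be reused under a different definer but not within the same namespace. It is not IMDC's implementation.

```python
class KeyRegistry:
    """Hypothetical registry that enforces Annotation Key uniqueness."""

    def __init__(self):
        self._keys = set()

    def define(self, definer, object_type, fmt, name):
        # The full identity of a Key is the 4-tuple below; object_type
        # and fmt may be None when the Key is not restricted by them.
        identity = (definer, object_type, fmt, name)
        if identity in self._keys:
            raise ValueError(f"key already defined: {identity}")
        self._keys.add(identity)
        return identity

    def __len__(self):
        return len(self._keys)


registry = KeyRegistry()
registry.define("imdc", "Data", None, "note")
registry.define("alice", "Data", None, "note")  # same name, other definer
```

A second attempt to define ("imdc", "Data", None, "note") would raise ValueError, because that combination is already taken.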
Fields:
- Definer
- The definer is equal to either the Format or the Contributor.
If equal to the Format, then this Key is an extension of the definition
of that Format. In search result tables, this will be indicated by
the text "format" rather than the full name of the format.
Otherwise, the definer is the contact who contributed this Key.
- Object Type
- If set, Annotations
with this Key may only be applied to objects with this Object Type.
The object type is part of the namespace.
- Format
- If set, Annotations
with this Key may only be applied to
(data or package) objects with this
Format.
The format is part of the namespace.
- name
- The name of the Key. The name must be unique for a given
combination of definer, object type, and format. Additionally,
names defined by imdc
have a hierarchical structure within the name itself;
you are encouraged to use this structure in any keys you define.
For example, a prefix of "cfg." indicates a key for
an annotation that describes a configuration parameter;
"cfg.active." is for configuration parameters of active probing;
and "cfg.active.ping." is for configuration parameters of
active probing with a ping-like program.
For more, browse the Annotation Keys with names ending in ".".
- Fragment type
- Not yet implemented. In the future, this will specify the type of
fragment to which annotations with this key can refer.
- Value type
- The type of value that can be stored in
Annotations with this Key:
none, string, integer, real, or boolean.
- Unique
- If "yes", only one Annotation with this Key
may be associated with any given object.
- Required
- If "yes", every object of the corresponding Object Type
and Format must have at least one Annotation with this Key.
- Creation permissions
- Permission to create annotations with this Key can optionally be
granted to one or more of the following classes of users:
- anyone (note that many keys that allow anyone to create an annotation
will also allow the object owner to edit the annotation)
- object owner (most annotation keys should have this set to "yes")
- key owner
- Editing permissions
- An Annotation's
owner always has permission to edit the annotation.
Additionally, permission to edit annotations with this Key can optionally
be granted to the owner of the object to which the annotation is attached.
That is, the owner (contributor) of an object
can edit Annotations with this Key on that object,
even if that Annotation
was contributed by someone else. When
the object contributor edits someone else's annotation, the editor
becomes the new contributor of that annotation, so the original
annotation contributor can no longer edit it.
For example, if Alice adds an annotation
to an object contributed by Bob, then both Alice and Bob will be allowed to
edit the annotation. But if Bob does edit the annotation, he will become
its owner, and Alice will no longer have any control over it.
- handle,
contributor,
creation time,
modification time,
short description,
description,
description URL,
private ID
- See Common Fields above.
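The ownership transfer described under Editing permissions can be sketched in a few lines of Python. The Annotation class, its field names, and the two functions below are hypothetical and only model the rule stated above: the annotation's contributor may always edit, the object's owner may edit if the Key grants that permission, and an edit by the object's owner makes that owner the new contributor.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    contributor: str       # current owner of the annotation
    object_owner: str      # contributor of the annotated object
    owner_may_edit: bool   # editing permission granted by the Key
    value: str


def can_edit(annotation, user):
    """Return True if user may edit the annotation."""
    if user == annotation.contributor:
        return True
    return annotation.owner_may_edit and user == annotation.object_owner


def edit(annotation, user, new_value):
    """Apply an edit and transfer ownership to the editor."""
    if not can_edit(annotation, user):
        raise PermissionError(f"{user} may not edit this annotation")
    annotation.value = new_value
    annotation.contributor = user  # the editor becomes the contributor
```

In the Alice and Bob example, both can_edit(..., "alice") and can_edit(..., "bob") are true at first. After edit(..., "bob", ...), Alice is no longer the contributor and can_edit(..., "alice") is false.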