Scale Documentation¶
Scale is a system that provides management of automated processing on a cluster of machines.
Overview¶
Scale allows users to define jobs, which can be any type of script or algorithm. These jobs run on ingested source data and produce product files. The resulting products can be disseminated to appropriate users and/or used to evaluate the producing algorithm in terms of performance and accuracy.
Mesos and Nodes
Scale runs across a cluster of networked machines (called nodes) that process
the jobs. Scale utilizes Apache Mesos, a free and open source project, for
managing the available resources on the nodes. Mesos informs Scale of available
computing resources and Scale schedules jobs to run on those resources.
Ingest
Scale ingests source files using a Scale component called Strike. Strike is a
process that monitors an ingest directory into which source data files are
being copied. After a new source data file has been ingested, Scale produces
and places jobs on the queue depending on the type of the ingested file. Many
Strike processes can be run simultaneously, allowing Scale to monitor many
different ingest directories.
Jobs
Scale creates jobs based on its known job types. A job type defines key
characteristics about an algorithm that Scale needs to know in order to run it
(what command to run, the algorithm’s inputs and outputs, etc.). Job types are
labeled with versions, allowing Scale to run multiple versions of the same
algorithm. Jobs may be created automatically due to an event, such as the
ingest of a particular type of source data file, or they may be created
manually by a user. Jobs that need to be executed are placed onto and
prioritized within a queue before being scheduled onto an available node. When
multiple jobs need to be run in a serial or parallel sequence, a recipe can
be created that defines the job workflow.
Products
Jobs can produce products as a result of their successful execution. Products
may be disseminated to users or used to analyze and improve the algorithms that
produced them. Scale allows the creation of different workspaces. A workspace
defines a separate location for storing source or product files. When a job is
created, it is given a workspace to use for storing its results, allowing a
user to control whether the job’s results are available to a wider audience or
are restricted to a private workspace for the user’s own use.
Interface¶
Scale provides a powerful web user interface that is built on top of a RESTful HTTP layer. The web UI provides an easy way to monitor and manage all of the jobs and data that Scale is processing. The RESTful HTTP interface provides a way for external applications to query Scale for status information or new product information.
Web UI¶
TODO
RESTful HTTP Interface¶
Scale provides a RESTful HTTP interface for its own web UI and for any external applications that would like to connect to Scale. The following sections describe the services available for each component of Scale.
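All list endpoints share the same pagination envelope (count, next, previous, results), so a single helper can walk any of them. Below is a minimal sketch in Python using the requests library; the host name is a placeholder for your own Scale deployment.

import requests

SCALE_URL = "http://scale.example.com"  # hypothetical deployment URL

def fetch_all(path, **params):
    # Walk a paginated Scale list endpoint, yielding each result object.
    url = SCALE_URL + path
    while url:
        response = requests.get(url, params=params)
        response.raise_for_status()
        payload = response.json()
        for result in payload["results"]:
            yield result
        url = payload["next"]  # null (None) once the last page is reached
        params = None          # the "next" URL already embeds the query string

# Example: iterate over every registered error.
for error in fetch_all("/errors/", page_size=100):
    print(error["name"], error["category"])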
Error Services¶
These services provide access to information about registered errors and error mappings.
Error List | |||
---|---|---|---|
Returns a list of all errors. | |||
GET /errors/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Include multiple times to multi-sort, (ex: order=name&order=version). Prefix the field with a dash ‘-’ to reverse the order, (ex: order=-name). |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Error Details) | |
.name | String | The stable name of the error used for queries. | |
.title | String | The human readable display name of the error. | |
.description | String | A longer description of the error. | |
.category | String | The category of the error. Choices: [SYSTEM, ALGORITHM, DATA]. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
{
"count": 23,
"next": null,
"previous": null,
"results": [
{
"id": 1,
"name": "unknown",
"title": "Unknown",
"description": "The error that caused the failure is unknown.",
"category": "SYSTEM",
"created": "2015-03-11T00:00:00Z",
"last_modified": "2015-03-11T00:00:00Z"
},
...
]
}
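For instance, a short sketch of narrowing the error list with the query parameters above (the host is again a placeholder):

import requests

response = requests.get(
    "http://scale.example.com/errors/",
    params={
        "started": "2015-03-01T00:00:00Z",  # ISO-8601 lower bound
        "order": "-last_modified",          # newest first
        "page_size": 50,
    },
)
response.raise_for_status()
for error in response.json()["results"]:
    print("{id}: {title} [{category}]".format(**error))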
Create Error | ||
---|---|---|
Creates a new error. | ||
POST /errors/ | ||
Content Type | application/json | |
JSON Fields | ||
name | String | The stable name of the error used for queries. |
title | String | The human readable display name of the error. |
description | String | A longer description of the error. |
category | String | The category of the error. Choices: [ALGORITHM, DATA]. |
{
"name": "error1",
"title": "Error 1",
"description": "This is an algorithm error",
"category": "ALGORITHM"
}
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the details for the newly created error | |
Content Type | application/json | |
JSON Fields | ||
JSON Object | All fields are the same as the error details model. (See Error Details) | |
{
"id": 100,
"name": "error1",
"title": "Error 1",
"description": "This is an algorithm error",
"category": "ALGORITHM",
"created": "2015-03-11T00:00:00Z",
"last_modified": "2015-03-11T00:00:00Z"
}
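A sketch of creating an error and reading back the Location header; the host is a placeholder:

import requests

new_error = {
    "name": "error1",
    "title": "Error 1",
    "description": "This is an algorithm error",
    "category": "ALGORITHM",  # only ALGORITHM and DATA may be created here
}
response = requests.post("http://scale.example.com/errors/", json=new_error)
response.raise_for_status()          # expects 201 CREATED
print(response.headers["Location"])  # details URL for the new error
print(response.json()["id"])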
Error Details | ||
---|---|---|
Returns the details for an error with the given id. | ||
GET /errors/{id}/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
name | String | The stable name of the error used for queries. |
title | String | The human readable display name of the error. |
description | String | A longer description of the error. |
category | String | The category of the error. Choices: [SYSTEM, ALGORITHM, DATA]. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
{
"id": 1,
"name": "unknown",
"title": "Unknown",
"description": "The error that caused the failure is unknown.",
"category": "SYSTEM",
"created": "2015-03-11T00:00:00Z",
"last_modified": "2015-03-11T00:00:00Z"
}
Edit Error | |||
---|---|---|---|
Edits an existing error. | |||
PATCH /errors/{id}/ | |||
Content Type | application/json | ||
JSON Fields | |||
title | String | Optional | The human readable display name of the error. |
description | String | Optional | A longer description of the error. |
category | String | Optional | The category of the error. Choices: [SYSTEM, ALGORITHM, DATA]. |
{
"title": "My Error",
"description": "An edited error description.",
"category": "ALGORITHM"
}
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
JSON Object | All fields are the same as the error details model. (See Error Details) | ||
{
"id": 100,
"name": "my-error",
"title": "My Error",
"description": "An edited error description.",
"category": "ALGORITHM",
"created": "2015-03-11T00:00:00Z",
"last_modified": "2015-03-11T00:00:00Z"
}
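A sketch of a partial update, assuming the PATCH endpoint above and a previously created error id (both host and id are placeholders):

import requests

ERROR_ID = 100  # hypothetical identifier

response = requests.patch(
    "http://scale.example.com/errors/{}/".format(ERROR_ID),
    json={"title": "My Error", "category": "ALGORITHM"},  # send only the fields to change
)
response.raise_for_status()
print(response.json()["last_modified"])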
Ingest Services¶
These services provide access to information about ingested files processed by the system.
Ingest List | |||
---|---|---|---|
Returns a list of all ingests. | |||
GET /ingests/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=status&order=created). Nested objects require a delimiter (ex: order=source_file__created). Prefix fields with a dash to reverse the sort, (ex: order=-status). |
status | String | Optional | Return only ingests with a status matching this string. Choices: [TRANSFERRING, TRANSFERRED, DEFERRED, INGESTING, INGESTED, ERRORED, DUPLICATE]. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Ingest Details) | |
.file_name | String | The name of the file being ingested. | |
.strike | JSON Object | The strike process that triggered the ingest. (See Strike Details) | |
.status | String | The current status of the ingest. Choices: [TRANSFERRING, TRANSFERRED, DEFERRED, INGESTING, INGESTED, ERRORED, DUPLICATE]. | |
.bytes_transferred | Integer | The total number of bytes transferred so far. | |
.transfer_started | ISO-8601 Datetime | When the transfer was started. | |
.transfer_ended | ISO-8601 Datetime | When the transfer ended. | |
.media_type | String | The IANA media type of the file. | |
.file_size | Integer | The size of the file in bytes. | |
.data_type | Array | A list of string data type “tags” for the file. | |
.ingest_started | ISO-8601 Datetime | When the ingest was started. | |
.ingest_ended | ISO-8601 Datetime | When the ingest ended. | |
.source_file | JSON Object | A reference to the source file that was stored by this ingest. (See Source File Details) | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
{
"count": 42,
"next": null,
"previous": null,
"results": [
{
"id": 14,
"file_name": "file_name.txt",
"strike": {
"id": 1,
"job": {
"id": 2
}
},
"status": "INGESTED",
"bytes_transferred": 1234,
"transfer_started": "2015-09-10T14:48:08.920Z",
"transfer_ended": "2015-09-10T14:48:08.956Z",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"ingest_started": "2015-09-10T15:24:53.503Z",
"ingest_ended": "2015-09-10T15:24:53.987Z",
"source_file": {
"id": 1,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "file_name.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file_name.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:36:56Z",
"data_ended": "2015-09-10T14:37:01Z",
"geometry": null,
"center_point": null,
"meta_data": {...},
"last_modified": "2015-09-10T15:25:03.797Z",
"is_parsed": true,
"parsed": "2015-09-10T15:25:03.796Z"
},
"created": "2015-09-10T15:24:47.412Z",
"last_modified": "2015-09-10T15:24:53.987Z"
},
...
]
}
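For example, a sketch that pulls only failed ingests from the last day, using the ISO-8601 duration form of started (placeholder host):

import requests

response = requests.get(
    "http://scale.example.com/ingests/",
    params={
        "status": "ERRORED",
        "started": "PT24H0M0S",  # duration form: the past 24 hours
        "order": "-ingest_started",
    },
)
response.raise_for_status()
for ingest in response.json()["results"]:
    print(ingest["file_name"], ingest["status"], ingest["bytes_transferred"])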
Ingest Details | ||
---|---|---|
Returns a specific ingest and all its related model information. | ||
GET /ingests/{id}/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
file_name | String | The name of the file being ingested. |
strike | JSON Object | The strike process that triggered the ingest. (See Strike Details) |
status | String | The current status of the ingest. Choices: [TRANSFERRING, TRANSFERRED, DEFERRED, INGESTING, INGESTED, ERRORED, DUPLICATE]. |
bytes_transferred | Integer | The total number of bytes transferred so far. |
transfer_started | ISO-8601 Datetime | When the transfer was started. |
transfer_ended | ISO-8601 Datetime | When the transfer ended. |
media_type | String | The IANA media type of the file. |
file_size | Integer | The size of the file in bytes. |
data_type | Array | A list of string data type “tags” for the file. |
ingest_started | ISO-8601 Datetime | When the ingest was started. |
ingest_ended | ISO-8601 Datetime | When the ingest ended. |
source_file | JSON Object | A reference to the source file that was stored by this ingest. (See Source File Details) |
created | ISO-8601 Datetime | When the associated database model was initially created. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
transfer_path | String | The absolute path of the destination where the file is being transferred. |
file_path | String | The relative path for where the file will be stored in the workspace. |
ingest_path | String | The absolute path of the file when it is ready to be ingested. |
{
"id": 14,
"file_name": "file_name.txt",
"strike": {
"id": 1,
"job": {
"id": 2,
"job_type": {
"id": 2,
"name": "scale-strike",
"version": "1.0",
"title": "Scale Strike",
"description": "Monitors a directory for incoming files to ingest",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 2
},
"event": {
"id": 2
},
"error": null,
"status": "RUNNING",
"priority": 5,
"num_exes": 1
},
"configuration": {
"transfer_suffix": "_tmp",
"mount": "host:/transfer",
"version": "1.0",
"mount_on": "/mounts/transfer",
"files_to_ingest": [
{
"workspace_path": "/workspace",
"data_types": [],
"filename_regex": "*.txt",
"workspace_name": "rs"
}
]
},
"created": "2015-09-10T15:24:42.896Z",
"last_modified": "2015-09-10T15:24:42.935Z"
},
"status": "INGESTED",
"bytes_transferred": 1234,
"transfer_started": "2015-09-10T14:48:08.920Z",
"transfer_ended": "2015-09-10T14:48:08.956Z",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"ingest_started": "2015-09-10T15:24:53.503Z",
"ingest_ended": "2015-09-10T15:24:53.987Z",
"source_file": {
"id": 1,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "file_name.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file_name.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:36:56Z",
"data_ended": "2015-09-10T14:37:01Z",
"geometry": null,
"center_point": null,
"meta_data": {...},
"last_modified": "2015-09-10T15:25:03.797Z",
"is_parsed": true,
"parsed": "2015-09-10T15:25:03.796Z"
},
"created": "2015-09-10T15:24:47.412Z",
"last_modified": "2015-09-10T15:24:53.987Z",
"transfer_path": "/mounts/transfer/file_name.txt",
"file_path": "path/file_name.txt",
"ingest_path": "/mounts/transfer/ingesting/file_name.txt"
}
Ingest Status | |||
---|---|---|---|
Returns status summary information (counts, file sizes) for completed ingests grouped into 1 hour time slots. NOTE: Time range must be within a one month period (31 days). | |||
GET /ingests/status/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). Defaults to the past 1 week. |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
use_ingest_time | Boolean | Optional | Whether to group counts by ingest time or data time. Ingest time is when the strike process registered the file. Data time is the time when the data was collected by a sensor. Defaults to False (data time). |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.strike | JSON Object | The strike process that triggered the ingest. (See Strike Details) | |
.most_recent | ISO-8601 Datetime | The date/time when the strike process last completed an ingest. | |
.files | Integer | The total number of files ingested by the strike process. | |
.size | Integer | The total size of files ingested by the strike process in bytes. | |
.values | Array | A list of ingest statistics grouped into 1 hour time slots. | |
..time | ISO-8601 Datetime | The date/time of the 1 hour time slot being counted. | |
..files | Integer | The number of files ingested by the strike process within the time slot. | |
..size | Integer | The size of files ingested by the strike process in bytes within the time slot. | |
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"strike": {
"id": 1,
"name": "my-strike",
"title": "My Strike Processor",
"description": "This Strike process handles the data feed",
"job": {
"id": 4,
"job_type": {
"id": 2,
"name": "scale-strike",
"version": "1.0",
"title": "Scale Strike",
"description": "Monitors a directory for incoming source files to ingest",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"event": {
"id": 5
},
"error": null,
"status": "RUNNING",
"priority": 5,
"num_exes": 36
},
"created": "2015-10-05T17:35:46.690Z",
"last_modified": "2015-10-05T17:35:46.740Z"
},
"most_recent": "2015-10-21T21:15:56.522Z",
"files": 1234,
"size": 12345678900000,
"values": [
{
"time": "2015-10-21T00:00:00Z",
"files": 10,
"size": 123456789
},
...
]
}
]
}
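A sketch that summarizes the status feed per Strike process, grouping by ingest time (placeholder host):

import requests

response = requests.get(
    "http://scale.example.com/ingests/status/",
    params={"use_ingest_time": "true"},  # group by ingest time, not data time
)
response.raise_for_status()
for entry in response.json()["results"]:
    mean_size = entry["size"] / max(entry["files"], 1)  # bytes per file
    print(entry["strike"]["name"], entry["files"], "files,",
          "mean size", int(mean_size), "bytes")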
Job Services¶
These services provide access to information about all jobs, including those currently running and those previously finished.
Job List | |||
---|---|---|---|
Returns a list of all jobs. | |||
GET /jobs/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). |
status | String | Optional | Return only jobs with a status matching this string. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
job_id | Integer | Optional | Return only jobs with a given identifier. Duplicate it to filter by multiple values. |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Details) | |
.job_type | JSON Object | The job type that is associated with the job. (See Job Type Details) | |
.job_type_rev | JSON Object | The job type revision that is associated with the job. This represents the definition at the time the job was scheduled. (See Job Type Revision Details) | |
.event | JSON Object | The trigger event that is associated with the job. (See Trigger Event Details) | |
.error | JSON Object | The error that is associated with the job. (See Error Details) | |
.status | String | The current status of the job. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. | |
.priority | Integer | The priority of the job. | |
.num_exes | Integer | The number of executions this job has had. | |
.timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). | |
.max_tries | Integer | The maximum number of times to attempt this job when failed (minimum one). | |
.cpus_required | Decimal | The number of CPUs needed for a job of this type. | |
.mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. | |
.disk_in_required | Decimal | The amount of disk space in MiB required for input files for this job. | |
.disk_out_required | Decimal | The amount of disk space in MiB required for output files for this job. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.queued | ISO-8601 Datetime | When the job was added to the queue to be run when resources are available. | |
.started | ISO-8601 Datetime | When the job started running. | |
.ended | ISO-8601 Datetime | When the job stopped running, which could be due to success or failure. | |
.last_status_change | ISO-8601 Datetime | When the status of the job was last changed. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
{
"count": 68,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 1
},
"revision_num": 1
},
"event": {
"id": 3,
"type": "STRIKE_TRANSFER",
"rule": null,
"occurred": "2015-08-28T17:57:24.261Z"
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_in_required": 0.0,
"disk_out_required": 64.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z"
},
...
]
}
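A sketch that lists recently failed jobs of one job type, combining the status and job_type_name filters (placeholder host):

import requests

response = requests.get(
    "http://scale.example.com/jobs/",
    params={
        "status": "FAILED",
        "job_type_name": "scale-ingest",  # repeat the key to match several names
        "order": "-last_status_change",
    },
)
response.raise_for_status()
for job in response.json()["results"]:
    error_name = job["error"]["name"] if job["error"] else "unknown"
    print(job["id"], job["job_type"]["title"], error_name)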
Job Details | ||
---|---|---|
Returns a specific job and all its related model information including executions, recipes, and products. | ||
GET /jobs/{id}/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
job_type | JSON Object | The job type that is associated with the job. (See Job Type Details) |
job_type_rev | JSON Object | The job type revision that is associated with the job. This represents the definition at the time the job was scheduled. (See Job Type Revision Details) |
event | JSON Object | The trigger event that is associated with the job. (See Trigger Event Details) |
error | JSON Object | The error that is associated with the job. (See Error Details) |
status | String | The current status of the job. |
priority | Integer | The priority of the job. |
num_exes | Integer | The number of executions this job has had. |
timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). |
max_tries | Integer | The maximum number of times to attempt this job when failed (minimum one). |
cpus_required | Decimal | The number of CPUs needed for a job of this type. |
mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. |
disk_in_required | Decimal | The amount of disk space in MiB required for input files for this job. |
disk_out_required | Decimal | The amount of disk space in MiB required for output files for this job. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
queued | ISO-8601 Datetime | When the job was added to the queue to be run when resources are available. |
started | ISO-8601 Datetime | When the job started running. |
ended | ISO-8601 Datetime | When the job stopped running, which could be due to success or failure. |
last_status_change | ISO-8601 Datetime | When the status of the job was last changed. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
data | JSON Object | An interface description for all the job input and output files. (See Job Data Specification Version 1.0) |
results | JSON Object | An interface description for all the job results meta-data. (See Job Results Specification Version 1.1) |
recipes | Array | A list of all recipes associated with the job. (See Recipe Details) |
job_exes | Array | A list of all job executions associated with the job. (See Job Execution Details) |
inputs | Array | A list of job interface inputs merged with their respective job data values. |
.name | String | The name of the input as defined by the job type interface. (See Job Interface Specification Version 1.0) |
.type | String | The type of the input as defined by the job type interface. (See Job Interface Specification Version 1.0) |
.value | Various | The actual value of the input, which can vary depending on the type. Simple property inputs will include primitive values, whereas the file or files type will include a full JSON representation of a Scale file object. (See Scale File Details) |
outputs | Array | A list of job interface outputs merged with their respective job result values. |
.name | String | The name of the output as defined by the job type interface. (See Job Interface Specification Version 1.0) |
.type | String | The type of the output as defined by the job type interface. (See Job Interface Specification Version 1.0) |
.value | Various | The actual value of the output, which can vary depending on the type. A file or files type will include a full JSON representation of a Product file object. (See Product Details) |
{
"id": 15096,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.0.0",
"title": "KML Footprint",
"description": "Creates a KML representation of the data",
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac",
"uses_docker": false,
"docker_privileged": false,
"docker_image": null,
"priority": 2,
"timeout": 600,
"max_tries": 1,
"cpus_required": 0.5,
"mem_required": 128.0,
"disk_out_const_required": 0.0,
"disk_out_mult_required": 0.0,
"created": "2015-06-01T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-06-01T00:00:00Z"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 8
},
"revision_num": 1,
"interface": {
"input_data": [
{
"type": "file",
"name": "input_file"
}
],
"output_data": [
{
"media_type": "application/vnd.google-earth.kml+xml",
"type": "file",
"name": "output_file"
}
],
"version": "1.0",
"command": "/usr/local/bin/python2.7 /app/parser/manage.py create_footprint_kml",
"command_arguments": "${input_file} ${job_output_dir}"
},
"created": "2015-11-06T00:00:00Z"
},
"event": {
"id": 10278,
"type": "PARSE",
"rule": {
"id": 8,
"type": "PARSE",
"is_active": true,
"created": "2015-08-28T18:31:29.282Z",
"archived": null,
"last_modified": "2015-08-28T18:31:29.282Z"
},
"occurred": "2015-09-01T17:27:31.467Z"
},
"error": null,
"status": "COMPLETED",
"priority": 210,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 15360.0,
"disk_in_required": 2.0,
"disk_out_required": 16.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z",
"data": {
"input_data": [
{
"name": "input_file",
"file_id": 8480
}
],
"version": "1.0",
"output_data": [
{
"name": "output_file",
"workspace_id": 2
}
]
},
"results": {
"output_data": [
{
"name": "output_file",
"file_id": 8484
}
],
"version": "1.0"
},
"recipes": [
{
"id": 4832,
"recipe_type": {
"id": 6,
"name": "Recipe",
"version": "1.0.0",
"description": "Recipe description"
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 2
},
"occurred": "2015-08-28T17:58:45.280Z"
},
"created": "2015-09-01T20:32:20.912Z",
"completed": "2015-09-01T20:35:20.912Z",
"last_modified": "2015-09-01T20:35:20.912Z"
}
],
"job_exes": [
{
"id": 14552,
"status": "COMPLETED",
"command_arguments": "${input_file} ${job_output_dir}",
"timeout": 1800,
"pre_started": "2015-09-01T17:27:32.435Z",
"pre_completed": "2015-09-01T17:27:34.346Z",
"pre_exit_code": null,
"job_started": "2015-09-01T17:27:42.437Z",
"job_completed": "2015-09-01T17:27:46.762Z",
"job_exit_code": null,
"post_started": "2015-09-01T17:27:47.246Z",
"post_completed": "2015-09-01T17:27:49.461Z",
"post_exit_code": null,
"created": "2015-09-01T17:27:31.753Z",
"queued": "2015-09-01T17:27:31.716Z",
"started": "2015-09-01T17:27:32.022Z",
"ended": "2015-09-01T17:27:49.461Z",
"last_modified": "2015-09-01T17:27:49.606Z",
"job": {
"id": 15586
},
"node": {
"id": 1
},
"error": null
}
],
"inputs": [
{
"name": "input_file",
"type": "file",
"value": {
"id": 2,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "input_file.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/input_file.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:50:49Z",
"data_ended": "2015-09-10T14:51:05Z",
"geometry": null,
"center_point": null,
"meta_data": {...}
"last_modified": "2015-09-10T15:25:02.808Z"
}
}
],
"outputs": [
{
"name": "output_file",
"type": "file",
"value": {
"id": 8484,
"workspace": {
"id": 2,
"name": "Products"
},
"file_name": "file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "2015-09-01T17:27:48.477Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {},
"last_modified": "2015-09-01T17:27:49.639Z",
"is_operational": true,
"is_published": true,
"published": "2015-09-01T17:27:49.461Z",
"unpublished": null,
"job_type": {
"id": 8
},
"job": {
"id": 35
},
"job_exe": {
"id": 19
}
}
}
]
}
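Since the details response merges interface inputs and outputs with their data values, mapping a job to its product files takes only a few lines. A sketch with a placeholder host and job id:

import requests

JOB_ID = 15096  # hypothetical identifier taken from the job list

response = requests.get("http://scale.example.com/jobs/{}/".format(JOB_ID))
response.raise_for_status()
job = response.json()

# Pair each file-typed output with the URL of the product it produced.
for output in job["outputs"]:
    if output["type"] in ("file", "files"):
        print(output["name"], "->", output["value"]["url"])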
Update Job | ||
---|---|---|
Update the details of a job. | ||
PATCH /jobs/{id}/ | ||
Content Type | application/json | |
JSON Fields | ||
status | String | The new status of the job. |
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
Response format is identical to GET but contains the updated data. | ||
Error Responses | ||
Status | 400 BAD REQUEST | |
Content Type | text/plain | |
Unexpected fields were specified, no fields were specified, or a field had an invalid value. An error message lists the specific problems. ||
Status | 404 NOT FOUND | |
Content Type | text/plain | |
The specified job or associated job executions (if applicable) were not found in the database. | ||
Status | 500 SERVER ERROR | |
Content Type | text/plain | |
A miscellaneous (and rare) server error or database timing error occurred. Repeating the request may result in success. The exact error reason will appear in the response content. |
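A sketch of a status update that handles each documented response; the host, job id, and the CANCELED value are assumptions for illustration:

import requests

JOB_ID = 15096  # hypothetical identifier

response = requests.patch(
    "http://scale.example.com/jobs/{}/".format(JOB_ID),
    json={"status": "CANCELED"},  # assumed status transition for this sketch
)
if response.status_code == 200:
    print("new status:", response.json()["status"])
elif response.status_code in (400, 404):
    print("request rejected:", response.text)  # error bodies are text/plain
else:
    response.raise_for_status()  # e.g. 500; repeating the request may succeed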
Job Updates | |||
---|---|---|---|
Returns a list of jobs with associated input files that changed status in the given time range. | |||
GET /jobs/updates/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). |
status | String | Optional | Return only jobs with a status matching this string. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Details) | |
.job_type | JSON Object | The job type that is associated with the job. (See Job Type Details) | |
.job_type_rev | JSON Object | The job type revision that is associated with the job. This represents the definition at the time the job was scheduled. (See Job Type Revision Details) | |
.event | JSON Object | The trigger event that is associated with the job. (See Trigger Event Details) | |
.error | JSON Object | The error that is associated with the job. (See Error Details) | |
.status | String | The current status of the job. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. | |
.priority | Integer | The priority of the job. | |
.num_exes | Integer | The number of executions this job has had. | |
.timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). | |
.max_tries | Integer | The maximum number of times to attempt this job when failed (minimum one). | |
.cpus_required | Decimal | The number of CPUs needed for a job of this type. | |
.mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. | |
.disk_in_required | Decimal | The amount of disk space in MiB required for input files for this job. | |
.disk_out_required | Decimal | The amount of disk space in MiB required for output files for this job. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.queued | ISO-8601 Datetime | When the job was added to the queue to be run when resources are available. | |
.started | ISO-8601 Datetime | When the job started running. | |
.ended | ISO-8601 Datetime | When the job stopped running, which could be due to success or failure. | |
.last_status_change | ISO-8601 Datetime | When the status of the job was last changed. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.input_files | JSON Object | A list of files that the job used as input. (See Scale File Details) | |
{
"count": 68,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 1
},
"revision_num": 1
},
"event": {
"id": 3,
"type": "STRIKE_TRANSFER",
"rule": null,
"occurred": "2015-08-28T17:57:24.261Z"
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_in_required": 0.0,
"disk_out_required": 64.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z",
"input_files": [
{
"id": 2,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "input_file.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/input_file.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:50:49Z",
"data_ended": "2015-09-10T14:51:05Z",
"geometry": null,
"center_point": null,
"meta_data": {...}
"last_modified": "2015-09-10T15:25:02.808Z"
}
]
},
...
]
}
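A sketch of a lightweight poller that asks only for jobs whose status changed in the last ten minutes (placeholder host):

import datetime
import requests

since = datetime.datetime.utcnow() - datetime.timedelta(minutes=10)
response = requests.get(
    "http://scale.example.com/jobs/updates/",
    params={
        "started": since.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "status": "COMPLETED",
    },
)
response.raise_for_status()
for job in response.json()["results"]:
    names = [f["file_name"] for f in job["input_files"]]
    print(job["id"], "completed with inputs:", names)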
Job with Execution List | |||
---|---|---|---|
Returns a list of all jobs with their latest execution. | |||
GET /jobs/executions/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). |
status | String | Optional | Return only jobs with a status matching this string. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Details) | |
.job_type | JSON Object | The job type that is associated with the job. (See Job Type Details) | |
.event | JSON Object | The trigger event that is associated with the job. (See Trigger Event Details) | |
.error | JSON Object | The error that is associated with the job. (See Error Details) | |
.status | String | The current status of the job. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. | |
.priority | Integer | The priority of the job. | |
.num_exes | Integer | The number of executions this job has had. | |
.timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). | |
.max_tries | Integer | The maximum number of times to attempt this job when failed (minimum one). | |
.cpus_required | Decimal | The number of CPUs needed for a job of this type. | |
.mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. | |
.disk_in_required | Decimal | The amount of disk space in MiB required for input files for this job. | |
.disk_out_required | Decimal | The amount of disk space in MiB required for output files for this job. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.queued | ISO-8601 Datetime | When the job was added to the queue to be run when resources are available. | |
.started | ISO-8601 Datetime | When the job started running. | |
.ended | ISO-8601 Datetime | When the job stopped running, which could be due to success or failure. | |
.last_status_change | ISO-8601 Datetime | When the status of the job was last changed. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.latest_job_exe | JSON Object | The most recent execution of the job. (See Job Execution Details) | |
{
"count": 68,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 1
},
"revision_num": 1
},
"event": {
"id": 3,
"type": "STRIKE_TRANSFER",
"rule": null,
"occurred": "2015-08-28T17:57:24.261Z"
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_in_required": 0.0,
"disk_out_required": 64.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z",
"latest_job_exe": {
"id": 3,
"status": "COMPLETED",
"command_arguments": "",
"timeout": 1800,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T17:57:44.703Z",
"job_completed": "2015-08-28T17:57:45.906Z",
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T17:57:41.033Z",
"queued": "2015-08-28T17:57:41.010Z",
"started": "2015-08-28T17:57:44.494Z",
"ended": "2015-08-28T17:57:45.906Z",
"last_modified": "2015-08-28T17:57:45.992Z",
"job": {
"id": 4
},
"node": {
"id": 2
},
"error": null
}
},
...
]
}
Job Execution Services¶
These services provide access to information about all job executions, including those currently running and those previously finished.
Job Execution List | |||
---|---|---|---|
Returns a list of all job executions. | |||
GET /job-executions/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=status&order=created). Nested objects require a delimiter (ex: order=job_type__name). Prefix fields with a dash to reverse the sort, (ex: order=-status). |
status | String | Optional | Return only executions with a status matching this string. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
node_id | Integer | Optional | Return only executions that ran on a given node. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Execution Details) | |
.status | String | The status of the job execution. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. | |
.command_arguments | String | The argument string to execute on the command line for this job execution. This field is populated when the job execution is scheduled to run on a node and is updated when any needed pre-job steps are run. | |
.timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). | |
.pre_started | ISO-8601 Datetime | When the pre-job steps were started on a node. | |
.pre_completed | ISO-8601 Datetime | When the pre-job steps were completed on a node. | |
.pre_exit_code | Integer | The exit code of the pre-steps job process for this job execution. | |
.job_started | ISO-8601 Datetime | When the actual job started running on a node. | |
.job_completed | ISO-8601 Datetime | When the actual job completed running on a node. | |
.job_exit_code | Integer | The exit code of the main job process for this job execution. | |
.post_started | ISO-8601 Datetime | When the post-job steps were started on a node. | |
.post_completed | ISO-8601 Datetime | When the post-job steps were completed on a node. | |
.post_exit_code | Integer | The exit code of the post-steps job process for this job execution. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.queued | ISO-8601 Datetime | When the job was added to the queue for this run and went to QUEUED status. | |
.started | ISO-8601 Datetime | When the job was scheduled and went to RUNNING status. | |
.ended | ISO-8601 Datetime | When the job execution ended. (FAILED, COMPLETED, or CANCELED) | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.job | JSON Object | The job that is associated with the execution. (See Job Details) | |
.node | JSON Object | The node that ran the execution. (See Node Details) | |
.error | JSON Object | The last error that was recorded for the execution. (See Error Details) | |
{
"count": 57,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"status": "COMPLETED",
"command_arguments": "",
"timeout": 1800,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T17:57:44.703Z",
"job_completed": "2015-08-28T17:57:45.906Z",
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T17:57:41.033Z",
"queued": "2015-08-28T17:57:41.010Z",
"started": "2015-08-28T17:57:44.494Z",
"ended": "2015-08-28T17:57:45.906Z",
"last_modified": "2015-08-28T17:57:45.992Z",
"job": {
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 2
},
"event": {
"id": 3
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1
},
"node": {
"id": 1,
"hostname": "machine.com",
"port": 5051,
"slave_id": "20150821-123454-1683014024-5050-8216-S2"
},
"error": null
},
...
]
}
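A sketch that inspects the executions currently running on a single node, using the node_id filter (placeholder host and node id):

import requests

response = requests.get(
    "http://scale.example.com/job-executions/",
    params={"node_id": 1, "status": "RUNNING"},
)
response.raise_for_status()
for execution in response.json()["results"]:
    print(execution["id"],
          execution["job"]["job_type"]["name"],
          "started", execution["started"])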
Job Execution Details | ||
---|---|---|
Returns a specific job execution and all its related model information including job, node, environment, and results. | ||
GET /job-executions/{id}/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Execution Details) |
status | String | The status of the job execution. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
command_arguments | String | The argument string to execute on the command line for this job execution. This field is populated when the job execution is scheduled to run on a node and is updated when any needed pre-job steps are run. |
timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). |
pre_started | ISO-8601 Datetime | When the pre-job steps were started on a node. |
pre_completed | ISO-8601 Datetime | When the pre-job steps were completed on a node. |
pre_exit_code | Integer | The exit code of the pre-steps job process for this job execution. |
job_started | ISO-8601 Datetime | When the actual job started running on a node. |
job_completed | ISO-8601 Datetime | When the actual job completed running on a node. |
job_exit_code | Integer | The exit code of the main job process for this job execution. |
post_started | ISO-8601 Datetime | When the post-job steps were started on a node. |
post_completed | ISO-8601 Datetime | When the post-job steps were completed on a node. |
post_exit_code | Integer | The exit code of the post-steps job process for this job execution. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
queued | ISO-8601 Datetime | When the job was added to the queue for this run and went to QUEUED status. |
started | ISO-8601 Datetime | When the job was scheduled and went to RUNNING status. |
ended | ISO-8601 Datetime | When the job execution ended. (FAILED, COMPLETED, or CANCELED) |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
job | JSON Object | The job that is associated with the execution. (See Job Details) |
node | JSON Object | The node that ran the execution. (See Node Details) |
error | JSON Object | The last error that was recorded for the execution. (See Error Details) |
environment | JSON Object | An interface description for the environment the job execution executed in. (See Job Environment Specification Version 1.0) |
cpus_scheduled | Decimal | The number of CPUs scheduled for the execution. |
mem_scheduled | Decimal | The amount of RAM in MiB scheduled for the execution. |
disk_in_scheduled | Decimal | The amount of disk space in MiB scheduled for input files for the execution. |
disk_out_scheduled | Decimal | The amount of disk space in MiB scheduled for output files for the execution. |
disk_total_scheduled | Decimal | The total amount of disk space in MiB scheduled for the execution. |
results | JSON Object | An interface description for all the possible job results meta-data. (See Job Results Specification Version 1.1) |
current_stdout_url | URL | The URL of the standard output log for the execution. |
current_stderr_url | URL | The URL of the standard error log for the job execution. |
results_manifest | JSON Object | An interface description for all the actual job results meta-data. (See Job Results Specification Version 1.1) |
{
"id": 3,
"status": "COMPLETED",
"command_arguments": "",
"timeout": 1800,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T17:57:44.703Z",
"job_completed": "2015-08-28T17:57:45.906Z",
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T17:57:41.033Z",
"queued": "2015-08-28T17:57:41.010Z",
"started": "2015-08-28T17:57:44.494Z",
"ended": "2015-08-28T17:57:45.906Z",
"last_modified": "2015-08-28T17:57:45.992Z",
"job": {
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 2
},
"event": {
"id": 3
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1
},
"node": {
"id": 1,
"hostname": "machine.com",
"port": 5051,
"slave_id": "20150821-123454-1683014024-5050-8216-S2",
"is_paused": false,
"is_active": true,
"archived": null,
"created": "2015-09-02T18:05:54.730Z",
"last_modified": "2015-09-08T16:53:57.439Z"
},
"error": null,
"environment": {...},
"cpus_scheduled": 0.5,
"mem_scheduled": 15360.0,
"disk_in_scheduled": 1.0,
"disk_out_scheduled": 0.0,
"disk_total_scheduled": 1.0,
"results": {
"output_data": [
{
"name": "output_file",
"file_id": 3
}
],
"version": "1.0"
},
"current_stdout_url": "http://host/out.txt",
"current_stderr_url": "http://host/error.txt",
"results_manifest": {
"output_data": [],
"version": "1.1",
"errors": [],
"parse_results": []
}
}
Job Execution Logs | ||
---|---|---|
Returns the stdout and stderr logs for a job execution. If the execution has not completed, the stdout and stderr of the currently running Mesos task are loaded dynamically; these additional calls add overhead and processing, so take care not to poll this endpoint at high frequency. ||
GET /job-executions/{id}/logs/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Job Execution Details) |
status | String | The status of the job execution. Choices: [QUEUED, RUNNING, FAILED, COMPLETED, CANCELED]. |
command_arguments | String | The argument string to execute on the command line for this job execution. This field is populated when the job execution is scheduled to run on a node and is updated when any needed pre-job steps are run. |
timeout | Integer | The maximum amount of time this job can run before being killed (in seconds). |
pre_started | ISO-8601 Datetime | When the pre-job steps were started on a node. |
pre_completed | ISO-8601 Datetime | When the pre-job steps were completed on a node. |
pre_exit_code | Integer | The exit code of the pre-steps job process for this job execution. |
job_started | ISO-8601 Datetime | When the actual job started running on a node. |
job_completed | ISO-8601 Datetime | When the actual job completed running on a node. |
job_exit_code | Integer | The exit code of the main job process for this job execution. |
post_started | ISO-8601 Datetime | When the post-job steps were started on a node. |
post_completed | ISO-8601 Datetime | When the post-job steps were completed on a node. |
post_exit_code | Integer | The exit code of the post-steps job process for this job execution. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
queued | ISO-8601 Datetime | When the job was added to the queue for this run and went to QUEUED status. |
started | ISO-8601 Datetime | When the job was scheduled and went to RUNNING status. |
ended | ISO-8601 Datetime | When the job execution ended. (FAILED, COMPLETED, or CANCELED) |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
job | JSON Object | The job that is associated with the execution. (See Job Details) |
node | JSON Object | The node that ran the execution. (See Node Details) |
error | JSON Object | The last error that was recorded for the execution. (See Error Details) |
is_finished | Boolean | Whether this job execution has finished running. |
stdout | String | Contents of stdout. |
stderr | String | Contents of stderr. |
{
"id": 3,
"status": "COMPLETED",
"command_arguments": "",
"timeout": 1800,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T17:57:44.703Z",
"job_completed": "2015-08-28T17:57:45.906Z",
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T17:57:41.033Z",
"queued": "2015-08-28T17:57:41.010Z",
"started": "2015-08-28T17:57:44.494Z",
"ended": "2015-08-28T17:57:45.906Z",
"last_modified": "2015-08-28T17:57:45.992Z",
"job": {
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 2
},
"event": {
"id": 3
},
"error": null,
"status": "COMPLETED",
"priority": 10,
"num_exes": 1
},
"node": {
"id": 1,
"hostname": "machine.com",
"port": 5051,
"slave_id": "20150821-123454-1683014024-5050-8216-S2"
},
"error": null,
"is_finished": true,
"stdout": "Execution completed.",
"stderr": ""
}
|
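Because this service proxies stdout and stderr from the running Mesos task, clients should poll it sparingly. Below is a minimal polling sketch in Python; the base URL is hypothetical, the requests library is assumed, and the GET path is the one shown above.

import time
import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def poll_logs(job_exe_id, interval_secs=30):
    """Poll job execution logs until the execution finishes."""
    while True:
        resp = requests.get("%s/job-executions/%d/logs/" % (BASE_URL, job_exe_id))
        resp.raise_for_status()
        logs = resp.json()
        if logs["stdout"]:
            print(logs["stdout"])
        # Stop once the execution has ended; see the status choices above
        if logs["status"] in ("FAILED", "COMPLETED", "CANCELED"):
            return logs
        time.sleep(interval_secs)  # modest interval to limit server overhead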
Job Type Services¶
These services provide access to information about job types.
Job Type List | |||||
---|---|---|---|---|---|
Returns a list of all job types. | |||||
GET /job-types/ | |||||
Query Parameters | |||||
page | Integer | Optional | The page of the results to return. Defaults to 1. | ||
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. | ||
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). | ||
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). | ||
name | String | Optional | Return only job types with a given name. Duplicate it to filter by multiple values. | ||
category | String | Optional | Return only job types with a given category. Duplicate it to filter by multiple values. | ||
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). | ||
Successful Response | |||||
Status | 200 OK | ||||
Content Type | application/json | ||||
JSON Fields | |||||
count | Integer | The total number of results that match the query parameters. | |||
next | URL | A URL to the next page of results. | |||
previous | URL | A URL to the previous page of results. | |||
results | Array | List of result JSON objects that match the query parameters. | |||
.id | Integer | The unique identifier of the model. Can be passed to the details API. (See Job Type Details) | |||
.name | String | The stable name of the job type used for queries. | |||
.version | String | The version of the job type. | |||
.title | String | The human readable display name of the job type. | |||
.description | String | A longer description of the job type. | |||
.category | String | An optional overall category of the job type. | |||
.author_name | String | The name of the person or organization that created the job algorithm. | |||
.author_url | String | The address to a home page about the author or associated algorithm. | |||
.is_system | Boolean | Whether this is a system type. | |||
.is_long_running | Boolean | Whether this type is long running. A job of this type is intended to run for a long time, potentially indefinitely, without timing out and always being re-queued after a failure. | |||
.is_active | Boolean | Whether the job type is active (false once job type is archived). | |||
.is_operational | Boolean | Whether this job type is operational (True) or is still in a research & development (R&D) phase (False). | |||
.is_paused | Boolean | Whether the job type is paused (while paused no jobs of this type will be scheduled off of the queue). | |||
.icon_code | String | A font-awesome icon code to use when representing this job type. | |||
.uses_docker | Boolean | Whether the job type uses Docker. | |||
.docker_privileged | Boolean | Whether this job type requires running in a privileged Docker container. | |||
.docker_image | String | The Docker image containing the code to run for this job. | |||
.revision_num | Integer | The current revision number of the job type, incremented for each edit. | |||
.priority | Integer | The priority of the job type (lower number is higher priority). | |||
.timeout | Integer | The maximum amount of time to allow a job of this type to run before being killed (in seconds). | |||
.max_scheduled | Integer | An optional number indicating the maximum number of jobs of this type that may be scheduled to run at the same time. May be ‘null’. | |||
.max_tries | Integer | The maximum number of times to try executing a job in case of errors. | |||
.cpus_required | Decimal | The number of CPUs needed for a job of this type. | |||
.mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. | |||
.disk_out_const_required | Decimal | A constant amount of disk space in MiB required for job output. | |||
.disk_out_mult_required | Decimal | A multiplier (2x = 2.0) applied to the size of the input files to determine additional disk space in MiB required for job output. | |||
.created | ISO-8601 Datetime | When the associated database model was initially created. | |||
.archived | ISO-8601 Datetime | When the job type was archived (no longer active). | |||
.paused | ISO-8601 Datetime | When the job type was paused. | |||
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |||
{
"count": 23,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Monitors a directory for incoming files to ingest",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013",
"uses_docker": false,
"docker_privileged": false,
"docker_image": null,
"revision_num": 1,
"priority": 1,
"timeout": 0,
"max_scheduled": 1,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"created": "2015-03-11T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-03-11T00:00:00Z"
},
...
]
}
|
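As a concrete illustration of the list parameters, the sketch below pages through all job types with a given name, following the next URL for pagination. The base URL and the use of the Python requests library are assumptions; the query parameter names come from the table above.

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def list_job_types(name=None):
    """Yield every job type, following pagination via the 'next' URL."""
    params = {"page_size": 100, "order": ["name", "version"]}
    if name:
        params["name"] = name
    url = "%s/job-types/" % BASE_URL
    while url:
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        page = resp.json()
        for job_type in page["results"]:
            yield job_type
        url = page["next"]  # null (None) on the last page
        params = None       # the 'next' URL already encodes the query string

for jt in list_job_types(name="scale-ingest"):
    print(jt["name"], jt["version"])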
Create Job Type | |||||
---|---|---|---|---|---|
Creates a new job type with an associated interface and error mapping. | |||||
POST /job-types/ | |||||
Content Type | application/json | ||||
JSON Fields | |||||
name | String | Required | The stable name of the job type used for queries. | ||
version | String | Required | The version of the job type. | ||
title | String | Optional | The human-readable name of the job type. | ||
description | String | Optional | An optional description of the job type. | ||
category | String | Optional | An optional overall category of the job type. | ||
author_name | String | Optional | The name of the person or organization that created the algorithm associated with the job type. | ||
author_url | String | Optional | The address to a home page about the author or associated algorithm run by the job type. | ||
is_long_running | Boolean | Optional | Whether this type is long running. A job of this type is intended to run for a long time, potentially indefinitely, without timing out and always being re-queued after a failure. | ||
is_paused | Boolean | Optional | Whether the job type is paused (while paused no jobs of this type will be scheduled off of the queue). | ||
icon_code | String | Optional | A font-awesome icon code to use when displaying this job type. | ||
docker_image | String | Optional | The Docker image containing the code to run for this job. | ||
priority | Integer | Optional | The priority of the job type (lower number is higher priority). | ||
timeout | Integer | Optional | The maximum amount of time to allow a job of this type to run before being killed (in seconds). | ||
max_scheduled | Integer | Optional | The maximum number of jobs of this type that may be scheduled to run at the same time. | ||
max_tries | Integer | Optional | The maximum number of times to try executing a job in case of errors. | ||
cpus_required | Decimal | Optional | The number of CPUs needed for a job of this type. | ||
mem_required | Decimal | Optional | The amount of RAM in MiB needed for a job of this type. | ||
disk_out_const_required | Decimal | Optional | A constant amount of disk space in MiB required for job output. | ||
disk_out_mult_required | Decimal | Optional | A multiplier (2x = 2.0) applied to the size of input files to determine additional disk space in MiB required for job output. | ||
interface | JSON Object | Required | JSON description of the interface for running the job. (See Job Interface Specification Version 1.0) | ||
error_mapping | JSON Object | Optional | JSON description that maps exit codes to known error models. (See Error Interface Specification Version 1.0) | ||
trigger_rule | JSON Object | Optional | A linked trigger rule that automatically invokes the job type. Type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause. (See Trigger Rule Details) | ||
{
"name": "my-job",
"version": "1.0",
"title": "My Job",
"description": "This is a description of the job",
"category": "test",
"author_name": null,
"author_url": null,
"is_long_running": false,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"docker_privileged": false,
"docker_image": null,
"priority": 1,
"timeout": 0,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"interface": {
"version": "1.0",
"command": "test_cmd",
"command_arguments": "test_arg",
"input_data": [
{
"media_types": ["image/png"],
"type": "file",
"name": "input_file"
}
],
"output_data": [],
"shared_resources": []
},
"error_mapping": {
"version": "1.0",
"exit_codes": {
"1": "unknown"
}
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "image/png",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
|||||
Successful Response | |||||
Status | 201 CREATED | ||||
Content Type | application/json | ||||
JSON Fields | |||||
JSON Object | All fields are the same as the job type details model. (See Job Type Details) | ||||
{
"id": 100,
"name": "my-job",
"version": "1.0",
"title": "My Job",
"description": "This is a description of the job",
"category": "test",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"uses_docker": true,
"docker_privileged": false,
"docker_image": null,
"revision_num": 1,
"priority": 1,
"timeout": 0,
"max_scheduled": null,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"created": "2015-03-11T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-03-11T00:00:00Z",
"interface": {...},
"error_mapping": {...},
"errors": [...],
"job_counts_6h": [...],
"job_counts_12h": [...],
"job_counts_24h": [...]
}
|
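Creating a job type is a single POST of the document shown in the request example. A minimal sketch, assuming a hypothetical base URL and the requests library:

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

job_type = {
    "name": "my-job",
    "version": "1.0",
    "title": "My Job",
    "interface": {
        "version": "1.0",
        "command": "test_cmd",
        "command_arguments": "test_arg",
        "input_data": [
            {"media_types": ["image/png"], "type": "file", "name": "input_file"}
        ],
        "output_data": [],
        "shared_resources": []
    }
}

resp = requests.post("%s/job-types/" % BASE_URL, json=job_type)
resp.raise_for_status()  # expect 201 CREATED on success
print("Created job type id:", resp.json()["id"])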
Validate Job Type | |||||||
---|---|---|---|---|---|---|---|
Validates a new job type without actually saving it. | |||||||
POST /job-types/validation/ | |||||||
Content Type | application/json | ||||||
JSON Fields | |||||||
name | String | Required | The stable name of the job type used for queries. | ||||
version | String | Required | The version of the job type. | ||||
title | String | Optional | The human-readable name of the job type. | ||||
description | String | Optional | An optional description of the job type. | ||||
category | String | Optional | An optional overall category of the job type. | ||||
author_name | String | Optional | The name of the person or organization that created the algorithm associated with the job type. | ||||
author_url | String | Optional | The address to a home page about the author or associated algorithm run by the job type. | ||||
is_long_running | Boolean | Optional | Whether this type is long running. A job of this type is intended to run for a long time, potentially indefinitely, without timing out and always being re-queued after a failure. | ||||
is_paused | Boolean | Optional | Whether the job type is paused (while paused no jobs of this type will be scheduled off of the queue). | ||||
icon_code | String | Optional | A font-awesome icon code to use when displaying this job type. | ||||
docker_image | String | Optional | The Docker image containing the code to run for this job. | ||||
priority | Integer | Optional | The priority of the job type (lower number is higher priority). | ||||
timeout | Integer | Optional | The maximum amount of time to allow a job of this type to run before being killed (in seconds). | ||||
max_scheduled | Integer | Optional | The maximum number of jobs of this type that may be scheduled to run at the same time. | ||||
max_tries | Integer | Optional | The maximum number of times to try executing a job in case of errors. | ||||
cpus_required | Decimal | Optional | The number of CPUs needed for a job of this type. | ||||
mem_required | Decimal | Optional | The amount of RAM in MiB needed for a job of this type. | ||||
disk_out_const_required | Decimal | Optional | A constant amount of disk space in MiB required for job output. | ||||
disk_out_mult_required | Decimal | Optional | A multiplier (2x = 2.0) applied to the size of input files to determine additional disk space in MiB required for job output. | ||||
interface | JSON Object | Required | JSON description of the interface for running the job. (See Job Interface Specification Version 1.0) | ||||
error_mapping | JSON Object | Optional | JSON description that maps exit codes to known error models. (See Error Interface Specification Version 1.0) | ||||
trigger_rule | JSON Object | Optional | A linked trigger rule that automatically invokes the job type. Type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause. (See Trigger Rule Details) | ||||
{
"name": "my-job",
"version": "1.0",
"title": "My Job",
"description": "This is a description of the job",
"category": "test",
"author_name": null,
"author_url": null,
"is_long_running": false,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"docker_privileged": false,
"docker_image": null,
"priority": 1,
"timeout": 0,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"interface": {
"version": "1.0",
"command": "test_cmd",
"command_arguments": "test_arg",
"input_data": [
{
"media_types": ["image/png"],
"type": "file",
"name": "input_file"
}
],
"output_data": [],
"shared_resources": []
},
"error_mapping": {
"version": "1.0",
"exit_codes": {
"1": "unknown"
}
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "image/png",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
|||||||
Successful Response | |||||||
Status | 200 OK | ||||||
Content Type | application/json | ||||||
JSON Fields | |||||||
warnings | Array | A list of warnings discovered during validation. | |||||
.id | String | An identifier for the warning. | |||||
.details | String | A human-readable description of the problem. | |||||
{
    "warnings": [
        {
            "id": "media_type",
            "details": "Invalid media type for data input: input_file -> image/png"
        }
    ]
}
|
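Because validation returns warnings without saving anything, a client can vet a definition before committing it. A sketch of that flow, under the same assumptions as the earlier examples:

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def create_if_valid(job_type):
    """Validate a job type definition and create it only if no warnings come back."""
    resp = requests.post("%s/job-types/validation/" % BASE_URL, json=job_type)
    resp.raise_for_status()
    warnings = resp.json()["warnings"]
    if warnings:
        for warning in warnings:
            print("%s: %s" % (warning["id"], warning["details"]))
        return None
    resp = requests.post("%s/job-types/" % BASE_URL, json=job_type)
    resp.raise_for_status()
    return resp.json()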
Job Type Details | |||
---|---|---|---|
Returns job type details. | |||
GET /job-types/{id}/ | |||
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
id | Integer | The unique identifier of the model. | |
name | String | The stable name of the job type used for queries. | |
version | String | The version of the job type. | |
title | String | The human readable display name of the job type. | |
description | String | A longer description of the job type. | |
category | String | An optional overall category of the job type. | |
author_name | String | The name of the person or organization that created the job algorithm. | |
author_url | String | The address to a home page about the author or associated algorithm. | |
is_system | Boolean | Whether this is a system type. | |
is_long_running | Boolean | Whether this type is long running. A job of this type is intended to run for a long time, potentially indefinitely, without timing out and always being re-queued after a failure. | |
is_active | Boolean | Whether the job type is active (false once job type is archived). | |
is_operational | Boolean | Whether this job type is operational (True) or is still in a research & development (R&D) phase (False). | |
is_paused | Boolean | Whether the job type is paused (while paused no jobs of this type will be scheduled off of the queue). | |
icon_code | String | A font-awesome icon code to use when representing this job type. | |
uses_docker | Boolean | Whether the job type uses Docker. | |
docker_privileged | Boolean | Whether this job type requires running in a privileged Docker container. |
docker_image | String | The Docker image containing the code to run for this job. |
revision_num | Integer | The current revision number of the job type, incremented for each edit. | |
priority | Integer | The priority of the job type (lower number is higher priority). | |
timeout | Integer | The maximum amount of time to allow a job of this type to run before being killed (in seconds). | |
max_scheduled | Integer | An optional number indicating the maximum number of jobs of this type that may be scheduled to run at the same time. May be ‘null’. | |
max_tries | Integer | The maximum number of times to try executing a job in case of errors. | |
cpus_required | Decimal | The number of CPUs needed for a job of this type. | |
mem_required | Decimal | The amount of RAM in MiB needed for a job of this type. | |
disk_out_const_required | Decimal | A constant amount of disk space in MiB required for job output. | |
disk_out_mult_required | Decimal | A multiplier (2x = 2.0) applied to the size of the input files to determine additional disk space in MiB required for job output. | |
created | ISO-8601 Datetime | When the associated database model was initially created. | |
archived | ISO-8601 Datetime | When the job type was archived (no longer active). | |
paused | ISO-8601 Datetime | When the job type was paused. | |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
interface | JSON Object | JSON description defining the interface for running a job of this type. (See Job Interface Specification Version 1.0) | |
error_mapping | JSON Object | JSON description defining the error mappings for a job of this type. (See Error Interface Specification Version 1.0) | |
trigger_rule | JSON Object | A linked trigger rule that automatically invokes the job type. Type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause. (See Trigger Rule Details) | |
errors | Array | List of all errors that are referenced by this job type’s error mapping. (See Error Details) | |
job_counts_6h | Array | List of job counts for the job type, grouped by status, for the past 6 hours. | |
.status | String | The type of job status the count represents. | |
.count | Integer | The number of jobs with that status. | |
.most_recent | ISO-8601 Datetime | The date/time when a job was last in that status. | |
.category | String | The category of the status, which is only used by a FAILED status. | |
job_counts_12h | Array | List of job counts for the job type, grouped by status, for the past 12 hours. | |
.status | String | The type of job status the count represents. | |
.count | Integer | The number of jobs with that status. | |
.most_recent | ISO-8601 Datetime | The date/time when a job was last in that status. | |
.category | String | The category of the status, which is only used by a FAILED status. | |
job_counts_24h | Array | List of job counts for the job type, grouped by status, for the past 24 hours. | |
.status | String | The type of job status the count represents. | |
.count | Integer | The number of jobs with that status. | |
.most_recent | ISO-8601 Datetime | The date/time when a job was last in that status. | |
.category | String | The category of the status, which is only used by a FAILED status. | |
{
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Monitors a directory for incoming files to ingest",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013",
"uses_docker": false,
"docker_privileged": false,
"docker_image": null,
"revision_num": 1,
"priority": 1,
"timeout": 0,
"max_scheduled": null,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"created": "2015-03-11T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-03-11T00:00:00Z"
"interface": {...},
"error_mapping": {...},
"trigger_rule": {...},
"errors": [...],
"job_counts_6h": [
{
"status": "QUEUED",
"count": 3,
"most_recent": "2015-09-16T18:36:12.278Z",
"category": null
}
],
"job_counts_12h": [
{
"status": "QUEUED",
"count": 3,
"most_recent": "2015-09-16T18:36:12.278Z",
"category": null
},
{
"status": "COMPLETED",
"count": 225,
"most_recent": "2015-09-16T18:40:01.101Z",
"category": null
}
],
"job_counts_24h": [
{
"status": "QUEUED",
"count": 3,
"most_recent": "2015-09-16T18:36:12.278Z",
"category": null
},
{
"status": "COMPLETED",
"count": 419,
"most_recent": "2015-09-16T18:40:01.101Z",
"category": null
},
{
"status": "FAILED",
"count": 1,
"most_recent": "2015-09-16T10:01:34.308Z",
"category": "SYSTEM"
}
]
}
|
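The job_counts_* arrays lend themselves to quick health checks for a job type. For example, the sketch below computes a 24-hour failure rate from the details response (hypothetical base URL; field names as described above):

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def failure_rate_24h(job_type_id):
    """Return the fraction of finished jobs that failed over the past 24 hours."""
    resp = requests.get("%s/job-types/%d/" % (BASE_URL, job_type_id))
    resp.raise_for_status()
    counts = resp.json()["job_counts_24h"]
    # FAILED may appear multiple times, once per category, so sum the entries
    failed = sum(c["count"] for c in counts if c["status"] == "FAILED")
    completed = sum(c["count"] for c in counts if c["status"] == "COMPLETED")
    finished = failed + completed
    return failed / float(finished) if finished else 0.0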
Edit Job Type | |||||
---|---|---|---|---|---|
Edits an existing job type with an associated interface and error mapping. | |||||
PATCH /job-types/{id}/ | |||||
Content Type | application/json | ||||
JSON Fields | |||||
title | String | Optional | The human-readable name of the job type. | ||
description | String | Optional | An optional description of the job type. | ||
category | String | Optional | An optional overall category of the job type. | ||
author_name | String | Optional | The name of the person or organization that created the algorithm associated with the job type. | ||
author_url | String | Optional | The address to a home page about the author or associated algorithm run by the job type. | ||
is_long_running | Boolean | Optional | Whether this type is long running. A job of this type is intended to run for a long time, potentially indefinitely, without timing out and always being re-queued after a failure. | ||
is_paused | Boolean | Optional | Whether the job type is paused (while paused no jobs of this type will be scheduled off of the queue). | ||
icon_code | String | Optional | A font-awesome icon code to use when displaying this job type. | ||
docker_image | String | Optional | The Docker image containing the code to run for this job. | ||
priority | Integer | Optional | The priority of the job type (lower number is higher priority). | ||
timeout | Integer | Optional | The maximum amount of time to allow a job of this type to run before being killed (in seconds). | ||
max_scheduled | Integer | Optional | The maximum number of jobs of this type that may be scheduled to run at the same time. | ||
max_tries | Integer | Optional | The maximum number of times to try executing a job in case of errors. | ||
cpus_required | Decimal | Optional | The number of CPUs needed for a job of this type. | ||
mem_required | Decimal | Optional | The amount of RAM in MiB needed for a job of this type. | ||
disk_out_const_required | Decimal | Optional | A constant amount of disk space in MiB required for job output. | ||
disk_out_mult_required | Decimal | Optional | A multiplier (2x = 2.0) applied to the size of input files to determine additional disk space in MiB required for job output. | ||
interface | JSON Object | Optional | JSON description of the interface for running the job. (See Job Interface Specification Version 1.0) | ||
error_mapping | JSON Object | Optional | JSON description that maps exit codes to known error models. (See Error Interface Specification Version 1.0) | ||
trigger_rule | JSON Object | Optional | A linked trigger rule that automatically invokes the job type. Type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause. (See Trigger Rule Details) | ||
{
"title": "My Job",
"description": "This is a description of the job",
"category": "test",
"author_name": null,
"author_url": null,
"is_long_running": false,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"docker_privileged": false,
"docker_image": null,
"priority": 1,
"timeout": 0,
"max_scheduled": 2,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"interface": {
"version": "1.0",
"command": "test_cmd",
"command_arguments": "test_arg",
"input_data": [
{
"media_types": ["image/png"],
"type": "file",
"name": "input_file"
}
],
"output_data": [],
"shared_resources": []
},
"error_mapping": {
"version": "1.0",
"exit_codes": {
"1": "unknown"
}
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "image/png",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
|||||
Successful Response | |||||
Status | 200 OK | ||||
Content Type | application/json | ||||
JSON Fields | |||||
JSON Object | All fields are the same as the job type details model. (See Job Type Details) | ||||
{
"id": 100,
"name": "my-job",
"version": "1.0",
"title": "My Job",
"description": "This is a description of the job",
"category": "test",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"uses_docker": true,
"docker_privileged": false,
"docker_image": null,
"revision_num": 1,
"priority": 1,
"timeout": 0,
"max_scheduled": 1,
"max_tries": 0,
"cpus_required": 0.5,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"created": "2015-03-11T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-03-11T00:00:00Z",
"interface": {...},
"error_mapping": {...},
"errors": [...],
"job_counts_6h": [...],
"job_counts_12h": [...],
"job_counts_24h": [...]
}
|
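Since every field in the edit request is optional, pausing a job type is just a partial update with the single field to change. A minimal sketch, assuming the PATCH path shown above:

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def set_job_type_paused(job_type_id, paused):
    """Pause or unpause a job type; while paused no new jobs are scheduled."""
    resp = requests.patch("%s/job-types/%d/" % (BASE_URL, job_type_id),
                          json={"is_paused": paused})
    resp.raise_for_status()
    return resp.json()

set_job_type_paused(100, True)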
Job Types Status | |||
---|---|---|---|
Returns a list of overall job type statistics, based on counts of jobs organized by status. Note that all jobs with a status of RUNNING are included regardless of date/time filters. | |||
GET /job-types/status/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). Defaults to the past 3 hours. |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.job_type | JSON Object | The job type that is associated with the statistics. (See Job Type Details) | |
.job_counts | Array | A list of recent job counts for the job type, grouped by status. | |
..status | String | The type of job status the count represents. | |
..count | Integer | The number of jobs with that status. | |
..most_recent | ISO-8601 Datetime | The date/time when a job was last in that status. | |
..category | String | The category of the status, which is only used by a FAILED status. | |
"count": 2,
"next": null,
"previous": null,
"results": [
{
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_counts": [
{
"status": "RUNNING",
"count": 1,
"most_recent": "2015-08-31T22:09:12.674Z",
"category": null
},
{
"status": "FAILED",
"count": 2,
"most_recent": "2015-08-31T19:28:30.799Z",
"category": "SYSTEM"
},
{
"status": "COMPLETED",
"count": 57,
"most_recent": "2015-08-31T21:51:40.900Z",
"category": null
}
]
},
{
"job_type": {
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Monitors a directory for incoming files to ingest",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_counts": []
},
...
]
}
|
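A common use of this service is flagging job types with recent failures. A short sketch, under the same assumptions as the earlier examples:

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

resp = requests.get("%s/job-types/status/" % BASE_URL)
resp.raise_for_status()
for entry in resp.json()["results"]:
    failures = [c for c in entry["job_counts"] if c["status"] == "FAILED"]
    if failures:
        total = sum(c["count"] for c in failures)
        print("%s: %d recent failures" % (entry["job_type"]["name"], total))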
Job Types Running | ||
---|---|---|
Returns counts of job types that are running, ordered by the longest running job. | ||
GET /job-types/running/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
count | Integer | The total number of results that match the query parameters. |
next | URL | A URL to the next page of results. |
previous | URL | A URL to the previous page of results. |
results | Array | List of result JSON objects that match the query parameters. |
.job_type | JSON Object | The job type that is associated with the count. (See Job Type Details) |
.count | Integer | The number of jobs of this type that are currently running. |
.longest_running | ISO-8601 Datetime | The run start time of the job of this type that has been running the longest. |
{
"count": 5,
"next": null,
"previous": null,
"results": [
{
"job_type": {
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"count": 1,
"longest_running": "2015-09-08T15:43:15.681Z"
},
...
]
}
|
Job Type System Failures | ||
---|---|---|
Returns counts of job types that have a critical system failure error, ordered by last error. | ||
GET /job-types/system-failures/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
count | Integer | The total number of results that match the query parameters. |
next | URL | A URL to the next page of results. |
previous | URL | A URL to the previous page of results. |
results | Array | List of result JSON objects that match the query parameters. |
.job_type | JSON Object | The job type that is associated with the count. (See Job Type Details) |
.count | Integer | The number of jobs of this type that have failed with the associated error. |
.error | JSON Object | The error that is associated with the count. (See Error Details) |
.first_error | ISO-8601 Datetime | When this error first occurred for a job of this type. |
.last_error | ISO-8601 Datetime | When this error most recently occurred for a job of this type. |
{
"count": 5,
"next": null,
"previous": null,
"results": [
{
"job_type": {
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"error": {
"id": 1,
"name": "Unknown",
"description": "The error that caused the failure is unknown.",
"category": "SYSTEM",
"created": "2015-03-11T00:00:00Z",
"last_modified": "2015-03-11T00:00:00Z"
},
"count": 38,
"first_error": "2015-08-28T23:29:28.719Z",
"last_error": "2015-09-08T16:27:42.243Z"
},
...
]
}
|
Metrics Services¶
These services provide access to information about processing counts and timings.
Metrics List | |||
---|---|---|---|
Returns a list of all metrics types. | |||
GET /metrics/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.name | String | The stable name of the metrics type used for queries. | |
.title | String | The human readable display name of the metrics type. | |
.description | String | A longer description of the metrics type. | |
.filters | Array | The filter parameters that can be used to query the metrics type. | |
..param | String | The stable name of the parameter used for queries. | |
..type | String | The data type of the parameter clients can use for validation. Examples: bool, date, datetime, float, int, string, time. | |
.groups | Array | The group definitions that can be used to select the results returned. | |
..name | String | The stable name of the metrics group used for queries. | |
..title | String | The human readable display name of the metrics group. | |
..description | String | A longer description of the metrics group. | |
.columns | Array | The column definitions that can be used to select the results returned. | |
..name | String | The stable name of the metrics column used for queries. | |
..title | String | The human readable display name of the metrics column. | |
..description | String | A longer description of the metrics column. | |
..units | String | Each value for the metrics column is converted to this type of unit. Examples: count, seconds | |
..group | String | Some metric columns are related together, which is indicated by the group name. | |
..aggregate | String | The math operation used to aggregate certain types of metrics. Examples: avg, max, min, sum | |
{
"count": 10,
"next": null,
"previous": null,
"results": [
{
"name": "job-types",
"title": "Job Types",
"description": "Metrics for jobs and executions grouped by job type.",
"filters": [
{
"param": "name",
"type": "string"
},
...
],
"groups": [
{
"name": "overview",
"title": "Overview",
"description": "Overall counts based on job status."
},
...
],
"columns": [
{
"name": "completed_count",
"title": "Completed Count",
"description": "Number of successfully completed jobs.",
"units": "count",
"group": "overview",
"aggregate": "sum"
},
...
]
},
...
]
}
|
Metric Details | ||
---|---|---|
Returns a specific metrics type and all its related model information including possible filter choices. | ||
GET /metrics/{name}/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
name | String | The stable name of the metrics type used for queries. |
title | String | The human readable display name of the metrics type. |
description | String | A longer description of the metrics type. |
filters | Array | The filter parameters that can be used to query the metrics type. |
.param | String | The stable name of the parameter used for queries. |
.type | String | The data type of the parameter clients can use for validation. Examples: bool, date, datetime, float, int, string, time. |
columns | Array | The column definitions that can be used to select the results returned. |
.name | String | The stable name of the metrics column used for queries. |
.title | String | The human readable display name of the metrics column. |
.description | String | A longer description of the metrics column. |
.units | String | Each value for the metrics column is converted to this type of unit. Examples: count, seconds |
.group | String | Some metric columns are related together, which is indicated by the group name. |
.aggregate | String | The math operation used to aggregate certain types of metrics. Examples: avg, max, min, sum |
choices | Array | The related model choices that can be used to filter the metrics records. All of the filter parameters described above are fields within the model. The list of choices allows clients to restrict filtering to only valid combinations. Each choice model is specific to a metrics type, so the actual fields vary. |
{
"name": "job-types",
"title": "Job Types",
"description": "Metrics for jobs and executions grouped by job type.",
"filters": [
{
"param": "name",
"type": "string"
},
{
"param": "version",
"type": "string"
}
],
"columns": [
{
"name": "completed_count",
"title": "Completed Count",
"description": "Number of successfully completed jobs.",
"units": "count",
"group": "overview",
"aggregate": "sum"
},
...
],
"choices": [
{
"id": 4,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Performs Scale system functions that need to be executed on regular time intervals",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
...
]
}
|
Metric Plot Data | |||
---|---|---|---|
Returns all the plot values for a metrics type based on optional query parameters. | |||
GET /metrics/{name}/plot-data/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
choice_id | Integer | Optional | Return only metrics associated with the related model choice. Each of these values must be one of the items in the choices list. Duplicate it to filter by multiple values. When no choice filters are used, values are aggregated across all the choices by date. |
column | String | Optional | Include only metrics with the given column name. The column name corresponds with a single statistic, such as completed count. Duplicate it to filter by multiple values. |
group | String | Optional | Include only metrics with the given group name. The group name corresponds with a collection of related statistics. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.column | Array | The column definition of the selected plot data values. | |
..name | String | The stable name of the metrics column used for queries. | |
..title | String | The human readable display name of the metrics column. | |
..description | String | A longer description of the metrics column. | |
..units | String | Each value for the metrics column is converted to this type of unit. Examples: count, seconds | |
..group | String | Some metric columns are related together, which is indicated by the group name. | |
..aggregate | String | The math operation used to aggregate certain types of metrics. Examples: avg, max, min, sum | |
.min_x | ISO-8601 Date | The minimum value within the x-axis for the metric column. The x-axis will always be based on time and consist of a single date. Supports the ISO-8601 date format, (ex: 2015-01-01). | |
.max_x | ISO-8601 Date | The maximum value within the x-axis for the metric column. The x-axis will always be based on time and consist of a single date. Supports the ISO-8601 date format, (ex: 2015-12-31). | |
.min_y | Integer | The minimum value within the y-axis for the metric column. The y-axis will always be a simple numeric value. | |
.max_y | Integer | The maximum value within the y-axis for the metric column. The y-axis will always be a simple numeric value. | |
.values | Array | List of plot value JSON objects for each choice and date in the data series. Note that the values are sorted oldest to newest. | |
..id | Integer | The unique identifier of the related choice model for this data value. This field is omitted when no choice filters are specified, or when only one is specified. | |
..date | ISO-8601 Date | The date when the plot value occurred. Uses the ISO-8601 date format, (ex: 2015-12-31). | |
..value | Integer | The statistic value that was calculated for the date. | |
{
"count": 28,
"next": null,
"previous": null,
"results": [
{
"column": {
"name": "run_time_min",
"title": "Run Time (Min)",
"description": "Minimum time spent running the pre, job, and post tasks.",
"units": "seconds",
"group": "run_time",
"aggregate": "min"
},
"min_x": "2015-10-05",
"max_x": "2015-10-13",
"min_y": 1,
"max_y": 300,
"values": [
{
"id": 1,
"date": "2015-10-05",
"value": 1
},
...
]
},
...
]
}
|
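The plot data maps naturally onto a per-day time series. The sketch below collects one column into a {date: value} dictionary, relying on the aggregation behavior described for choice_id above (hypothetical base URL):

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def column_by_day(metrics_name, column):
    """Return {date: value} for one metrics column over the default time range."""
    url = "%s/metrics/%s/plot-data/" % (BASE_URL, metrics_name)
    resp = requests.get(url, params={"column": column})
    resp.raise_for_status()
    series = {}
    for result in resp.json()["results"]:
        for point in result["values"]:
            # With no choice_id filters, values are already aggregated by date
            series[point["date"]] = point["value"]
    return series

print(column_by_day("job-types", "completed_count"))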
Node Services¶
These services provide access to information about the nodes.
Node List | |||
---|---|---|---|
Returns a list of all nodes. | |||
GET /nodes/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. | |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=host_name&order=created). Prefix fields with a dash to reverse the sort, (ex: order=-created). |
include_inactive | Boolean | Optional | If true, all nodes in the database are returned including those marked inactive. These are typically removed from the cluster. The default is False. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Node Details) | |
.hostname | String | The full domain-qualified hostname of the node. | |
.port | Integer | The port being used by the executor on this node. | |
.slave_id | String | The slave ID used by Mesos for the node. | |
.pause_reason | String | The reason this node is paused if is_paused is true. This is a descriptive field for presentation to the user. | |
.is_paused | Boolean | True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. | |
.is_paused_errors | Boolean | True if the node was automatically paused due to a high error rate. | |
.is_active | Boolean | True if the node is actively participating in the cluster. | |
.archived | ISO-8601 Datetime | (Optional) When the node was removed (is_active == False) from the cluster. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.last_offer | ISO-8601 Datetime | When the node last received an offer from Mesos. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
{
"count": 9,
"next": null,
"previous": null,
"results": [
{
"id": 4,
"hostname": "host.com",
"port": 5051,
"slave_id": "20150828-143216-659603848-5050-13473-S9",
"is_paused": false,
"is_paused_errors": false,
"is_active": true,
"archived": null,
"created": "2015-08-28T18:32:33.954Z",
"last_offer": null,
"last_modified": "2015-09-04T13:53:46.670Z"
},
...
]
}
|
Node Details | |||
---|---|---|---|
Returns a specific node and all its related model information including resource usage. | |||
GET /nodes/{id}/ | |||
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Node Details) | |
hostname | String | The full domain-qualified hostname of the node. | |
port | Integer | The port being used by the executor on this node. | |
slave_id | String | The slave ID used by Mesos for the node. | |
pause_reason | String | The reason this node is paused if is_paused is true. This is a descriptive field for presentation to the user. | |
is_paused | Boolean | True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. | |
is_paused_errors | Boolean | True if the node was automatically paused due to a high error rate. | |
is_active | Boolean | True if the node is actively participating in the cluster. | |
archived | ISO-8601 Datetime | (Optional) When the node was removed (is_active == False) from the cluster. | |
created | ISO-8601 Datetime | When the associated database model was initially created. | |
last_offer | ISO-8601 Datetime | When the node last received an offer from Mesos. | |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
resources | JSON Object | (Optional) Information about the hardware resources of the node. NOTE: Resource information may not always be available. | |
.total | JSON Object | The total hardware resources for the node | |
..cpus | Float | The total number of CPUs at this node | |
..mem | Float | The total amount of RAM in MiB at this node | |
..disk | Float | The total amount of disk space in MiB at this node | |
.scheduled | JSON Object | The scheduled hardware resources for the node | |
..cpus | Float | The scheduled number of CPUs at this node | |
..mem | Float | The scheduled amount of RAM in MiB at this node | |
..disk | Float | The scheduled amount of disk space in MiB at this node | |
.used | JSON Object | The used hardware resources for the node. NOTE: Real-time resource usage is not currently available, so these values will be all zero. | |
..cpus | Float | The used number of CPUs at this node | |
..mem | Float | The used amount of RAM in MiB at this node | |
..disk | Float | The used amount of disk space in MiB at this node | |
disconnected | Boolean | (Optional) If present and true, there is an active node entry in the Scale database but Mesos does not have a corresponding active slave. | |
job_exes_running | Array | A list of job executions currently running on the node. (See Job Execution Details) | |
{
"id": 4,
"hostname": "host.com",
"port": 5051,
"slave_id": "20150616-103057-1800454536-5050-6193-S2",
"is_paused": false,
"is_paused_errors": false,
"is_active": true,
"archived": null,
"created": "2015-06-15T17:18:52.414Z",
"last_offer": null,
"last_modified": "2015-06-17T20:05:16.041Z",
"job_exes_running": [
{
"id": 1,
"status": "RUNNING",
"command_arguments": "",
"timeout": 0,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T18:32:34.295Z",
"job_completed": null,
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T18:32:33.862Z",
"queued": "2015-08-28T18:32:33.833Z",
"started": "2015-08-28T18:32:34.040Z",
"ended": null,
"last_modified": "2015-08-28T18:32:34.389Z",
"job": {
"id": 1,
"job_type": {
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Performs Scale system functions that need to be executed periodically",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
},
"event": {
"id": 1
},
"error": null,
"status": "RUNNING",
"priority": 1,
"num_exes": 19
},
"node": {
"id": 7
},
"error": null,
"cpus_scheduled": 1.0,
"mem_scheduled": 1024.0,
"disk_in_scheduled": 0.0,
"disk_out_scheduled": 0.0,
"disk_total_scheduled": 0.0
}
],
"resources": {
"total": {
"cpus": 16.0,
"mem": 63305.0,
"disk": 131485.0
},
"scheduled": {
"cpus": 12.0,
"mem": 35392.0,
"disk": 131408.0
},
"used": {
"cpus": 16.0,
"mem": 63305.0,
"disk": 131485.0
}
}
}
|
|||
Error Responses | |||
Status | 404 NOT FOUND | ||
Content Type | text/plain | ||
The specified slave_id does not exist in the database. |
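The resources object lends itself to quick utilization checks. A sketch that reports scheduled-versus-total usage for a node, guarding against the documented case where resource information is unavailable (hypothetical base URL):

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def node_utilization(node_id):
    """Return scheduled/total fractions for cpus, mem, and disk, or None."""
    resp = requests.get("%s/nodes/%d/" % (BASE_URL, node_id))
    resp.raise_for_status()
    resources = resp.json().get("resources")
    if not resources:  # resource information may not always be available
        return None
    total, scheduled = resources["total"], resources["scheduled"]
    return {key: scheduled[key] / total[key] if total[key] else 0.0
            for key in ("cpus", "mem", "disk")}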
Replace Node | ||
---|---|---|
Replaces node data with the specified data. | ||
PUT /nodes/{id}/ | ||
Content Type | application/json | |
JSON Fields | ||
hostname | String | The full domain-qualified hostname of the node. |
port | Integer | The port being used by the executor on this node. |
pause_reason | String | The reason this node is paused if is_paused is true. If is_paused is false this field will be set to null. This should provide a brief description for user display. |
is_paused | Boolean | True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. |
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the details for the node (should be the same as the request URL) | |
Content Type | application/json | |
Response format is identical to GET but contains the updated data. | ||
JSON Fields | ||
hostname | String | The full domain-qualified hostname of the node. |
port | Integer | The port being used by the executor on this node. |
slave_id | String | The slave ID used by Mesos for the node. |
pause_reason | String | The reason this node is paused if is_paused is true. This is a descriptive field for presentation to the user. |
is_paused | Boolean | True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. |
is_paused_errors | Boolean | True if the node was automatically paused due to a high error rate. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
Error Responses | ||
Status | 400 BAD REQUEST | |
Content Type | text/plain | |
Bad update fields were specified: either unexpected fields were present or required fields were missing. An error message lists them. | |
Status | 404 NOT FOUND | |
Content Type | text/plain | |
The specified slave_id does not exist in the database. |
Update Node | ||
---|---|---|
Update one or more fields in an existing node. | ||
PATCH /nodes/{id}/ | ||
Content Type | application/json | |
JSON Fields | ||
hostname | String | (Optional) The full domain-qualified hostname of the node. |
port | Integer | (Optional) The port being used by the executor on this node. |
pause_reason | String | (Optional) The reason this node is paused if is_paused is true. If is_paused is false, this field will be set to null. This should provide a brief description for user display. |
is_paused | Boolean | (Optional) True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. |
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the details for the node (should be the same as the request URL). | |
Content Type | application/json | |
Response format is identical to GET but contains the updated data. | ||
JSON Fields | ||
hostname | String | The full domain-qualified hostname of the node. |
port | Integer | The port being used by the executor on this node. |
slave_id | String | The slave ID used by Mesos for the node. |
is_paused | Boolean | True if the node is paused and will not accept new jobs for execution. Remaining tasks for a previously executing job will complete. |
is_paused_errors | Boolean | True if the node was automatically paused due to a high error rate. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
Error Responses | ||
Status | 400 BAD REQUEST | |
Content Type | text/plain | |
Unexpected fields were specified, or no fields were specified at all. An error message lists them. | |
Status | 404 NOT FOUND | |
Content Type | text/plain | |
The specified slave_id does not exist in the database. |
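Pausing a node for maintenance is a typical partial update: set is_paused along with a human-readable pause_reason. A minimal sketch, assuming the PATCH path shown above:

import requests

BASE_URL = "http://scale.example.com/api"  # hypothetical base URL

def pause_node(node_id, reason):
    """Pause a node so it accepts no new jobs; running tasks finish normally."""
    resp = requests.patch("%s/nodes/%d/" % (BASE_URL, node_id),
                          json={"is_paused": True, "pause_reason": reason})
    resp.raise_for_status()
    return resp.json()

pause_node(4, "OS patching")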
Nodes Status | |||
---|---|---|---|
Returns a list of overall node statistics, based on counts of job executions organized by status. This only returns data for nodes marked as active in the database. For status information on nodes that are no longer in the cluster (is_active is false), request node details for that specific node ID. | |||
GET /nodes/status/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.node | JSON Object | The node that is associated with the statistics. (See Node Details) | |
.is_online | Boolean | (Optional) Whether or not the node is running and available. | |
.job_exe_counts | Array | A list of recent job execution counts for the node, grouped by status. | |
..status | String | The type of job execution status the count represents. | |
..count | Integer | The number of job executions for the status attempted by the node. | |
..most_recent | ISO-8601 Datetime | The date/time when the node last ran a job execution with the status. | |
..category | String | The category of the status, which is only used by a FAILED status. | |
.job_exes_running | Array | A list of job executions currently running on the node. (See Job Execution Details) | |
"count": 2,
"next": null,
"previous": null,
"results": [
{
"node": {
"id": 2
"hostname": "host1.com",
"port": 5051,
"slave_id": "20150821-144617-659603848-5050-22035-S2",
"is_paused": false,
"is_paused_errors": false,
"is_active": true,
"archived": null,
"created": "2015-07-08T17:49:21.771Z",
"last_modified": "2015-07-08T17:49:21.771Z",
},
"is_online": true,
"job_exe_counts": [
{
"status": "RUNNING",
"count": 1,
"most_recent": "2015-08-31T22:09:12.674Z",
"category": null
},
{
"status": "FAILED",
"count": 2,
"most_recent": "2015-08-31T19:28:30.799Z",
"category": "SYSTEM"
},
{
"status": "COMPLETED",
"count": 57,
"most_recent": "2015-08-31T21:51:40.900Z",
"category": null
}
],
"job_exes_running": [
{
"id": 1,
"status": "RUNNING",
"command_arguments": "",
"timeout": 0,
"pre_started": null,
"pre_completed": null,
"pre_exit_code": null,
"job_started": "2015-08-28T18:32:34.295Z",
"job_completed": null,
"job_exit_code": null,
"post_started": null,
"post_completed": null,
"post_exit_code": null,
"created": "2015-08-28T18:32:33.862Z",
"queued": "2015-08-28T18:32:33.833Z",
"started": "2015-08-28T18:32:34.040Z",
"ended": null,
"last_modified": "2015-08-28T18:32:34.389Z",
"job": {
"id": 1,
"job_type": {
"id": 3,
"name": "scale-clock",
"version": "1.0",
"title": "Scale Clock",
"description": "Performs Scale system functions that need to be executed periodically",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": true,
"is_long_running": true,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
},
"event": {
"id": 1
},
"error": null,
"status": "RUNNING",
"priority": 1,
"num_exes": 19
},
"node": {
"id": 7
},
"error": null,
"cpus_scheduled": 1.0,
"mem_scheduled": 1024.0,
"disk_in_scheduled": 0.0,
"disk_out_scheduled": 0.0,
"disk_total_scheduled": 0.0
}
]
},
{
"node": {
"id": 1
"hostname": "host2.com",
"port": 5051,
"slave_id": "20150821-144617-659603848-5050-22035-S1",
"is_paused": false,
"is_paused_errors": false,
"is_active": true,
"archived": null,
"created": "2015-07-08T17:49:21.771Z",
"last_modified": "2015-07-08T17:49:21.771Z"
},
"is_online": false,
"job_exe_counts": [],
"job_exes_running": []
},
...
]
}
|
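As a sketch of how a monitoring script might consume this service (assuming Python with the requests library and a hypothetical host of scale.example.com), the following flags active nodes with recent FAILED executions:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

def nodes_with_failures(started='PT3H0M0S'):
    """Return (hostname, failed_count) pairs for nodes reporting recent failures."""
    resp = requests.get(BASE + '/nodes/status/', params={'started': started})
    resp.raise_for_status()
    pairs = []
    for result in resp.json()['results']:
        failed = sum(c['count'] for c in result['job_exe_counts']
                     if c['status'] == 'FAILED')
        if failed:
            pairs.append((result['node']['hostname'], failed))
    return pairs

print(nodes_with_failures())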
Import/Export Services¶
These services allow administrators to export recipe, job, and error records and safely import them to another system.
Export | |||
---|---|---|---|
Exports configuration records for recipe types, job types, and errors. | |||
GET /configuration/ | |||
Query Parameters | |||
include | String | Optional | The types of records to include in the export. Defaults to all. Choices: [recipe_types, job_types, errors]. Duplicate it to filter by multiple values. |
recipe_type_id | Integer | Optional | Return only recipe types with a given recipe type identifier. Duplicate it to filter by multiple values. |
recipe_type_name | String | Optional | Return only recipe types with a given recipe type name. Duplicate it to filter by multiple values. |
job_type_id | Integer | Optional | Return only job types with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only job types with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only job types with a given job type category. Duplicate it to filter by multiple values. |
error_id | Integer | Optional | Return only errors with a given error identifier. Duplicate it to filter by multiple values. |
error_name | String | Optional | Return only errors with a given error name. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
version | String | The version number of the configuration schema. | |
recipe_types | Array | List of exported recipe types. (See Recipe Type Details) | |
job_types | Array | List of exported job types. (See Job Type Details) | |
errors | Array | List of exported errors. (See Error Details) | |
{
"version": "1.0",
"recipe_types": [
{
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "Runs my recipe",
"definition": {...},
"trigger_rule": {...}
},
...
],
"job_types": [
{
"name": "my-job",
"version": "1.0.0",
"title": "My Job",
"description": "Runs my job",
"category": null,
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_operational": true,
"icon_code": "f013",
"uses_docker": true,
"docker_privileged": false,
"docker_image": null,
"priority": 1,
"timeout": 0,
"max_scheduled": 1,
"max_tries": 0,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"interface": {...},
"error_mapping": {...},
"trigger_rule": {...}
},
...
],
"errors": [
{
"name": "bad-data",
"title": "Bad Data",
"description": "Bad data detected",
"category": "DATA"
},
...
]
}
|
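For example, a backup script might pull a filtered export and write it to disk; a minimal sketch, assuming the requests library and a hypothetical host:

import json
import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

# Repeat the include parameter to export multiple record types.
params = [('include', 'job_types'), ('include', 'errors')]
config = requests.get(BASE + '/configuration/', params=params).json()

with open('scale-config.json', 'w') as f:
    json.dump(config, f, indent=2)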
Import | |||
---|---|---|---|
Imports configuration records for recipe types, job types, and errors. | |||
POST /configuration/ | |||
Content Type | application/json | |||
JSON Fields | |||
import | JSON Object | Required | The previously exported configuration to load. |
.version | String | Optional | The version number of the configuration schema. Defaults to the latest version. |
.recipe_types | Array | Optional | List of recipe types to import. (See Recipe Type Details) |
.job_types | Array | Optional | List of job types to import. (See Job Type Details) |
.errors | Array | Optional | List of errors to import. (See Error Details) |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
warnings | Array | A list of warnings discovered during import. | |
.id | String | An identifier for the warning. | |
.details | String | A human-readable description of the problem. | |
{
"warnings": [
"id": "media_type",
"details": "Invalid media type for data input: input_file -> image/png"
]
}
|
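The exported document is wrapped in a required top-level import field when it is loaded back in. A minimal sketch, assuming the requests library and a hypothetical host:

import json
import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

with open('scale-config.json') as f:
    config = json.load(f)

# The previously exported configuration goes under the "import" key.
resp = requests.post(BASE + '/configuration/', json={'import': config})
resp.raise_for_status()
for warning in resp.json()['warnings']:
    print(warning['id'], '-', warning['details'])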
Validate Import | |||
---|---|---|---|
Validate import configuration records for recipe types, job types, and errors. | |||
POST /configuration/validation/ | |||
Content Type | application/json | |||
JSON Fields | |||
import | JSON Object | Required | The previously exported configuration to check. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
warnings | Array | A list of warnings discovered during validation. | |
.id | String | An identifier for the warning. | |
.details | String | A human-readable description of the problem. | |
{
"warnings": [
"id": "media_type",
"details": "Invalid media type for data input: input_file -> image/png"
]
}
|
Export Download |
---|
Exports configuration records for recipe types, job types, and errors as a download attachment response. All the request parameters and response fields are identical to the normal export. (See Export) This is purely a convenience API for web applications to provide a Save As... download prompt to users. |
GET /configuration/download/ |
No Response
|
Import Upload |
---|
Imports configuration records for recipe types, job types, and errors using a multi-part form encoding. All the request parameters and response fields are identical to the normal import. (See Import) This is purely a convenience API for web applications to provide a Browse... file input to users. The API supports traditional file uploads using a form element like this: <form method="POST" enctype="multipart/form-data" action="SERVER/configuration/upload/">
<input type="file" name="import"></input>
<button type="submit">Import</button>
</form>
The API also supports more modern AJAX file uploads by providing the file name in the X-File-Name HTTP header (seen by the server as HTTP_X_FILE_NAME). |
POST /configuration/upload/ |
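The same multipart upload can be scripted; a minimal sketch, assuming the requests library and a hypothetical host:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

# The file must be sent under the form field name "import", matching the form above.
with open('scale-config.json', 'rb') as f:
    resp = requests.post(BASE + '/configuration/upload/',
                         files={'import': ('scale-config.json', f)})
resp.raise_for_status()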
Product Services¶
These services provide access to information about products that Scale has produced.
Product List | |||
---|---|---|---|
Returns a list of all products. | | |
GET /products/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=file_name&order=created). Nested objects require a delimiter (ex: order=job_type__name). Prefix fields with a dash to reverse the sort, (ex: order=-created). |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
is_operational | Boolean | Optional | Return only products flagged as operational status versus R&D. |
file_name | String | Optional | Return only products with a given file name. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Product Details) | |
.workspace | JSON Object | The workspace that has stored the product. (See Workspace Details) | |
.file_name | String | The name of the product file. | |
.media_type | String | The IANA media type of the product file. | |
.file_size | Integer | The size of the product file in bytes. | |
.data_type | Array | List of strings describing the data type of the product. | |
.is_deleted | Boolean | Whether the product file has been deleted. | |
.uuid | String | A unique identifier that stays stable across multiple job execution runs. | |
.url | URL | The absolute URL to use for downloading the file. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.deleted | ISO-8601 Datetime | When the product file was deleted. | |
.data_started | ISO-8601 Datetime | When collection of the underlying data file started. | |
.data_ended | ISO-8601 Datetime | When collection of the underlying data file ended. | |
.geometry | WKT String | The full geospatial geometry footprint of the product. | |
.center_point | WKT String | The central geospatial location of the product. | |
.meta_data | JSON Object | A dictionary of key/value pairs that describe product-specific attributes. | |
.countries | Array | A list of zero or more strings with the ISO3 country codes for countries contained in the geographic boundary of this file. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.is_operational | Boolean | Whether this product was produced by an operational job type or a job type still in research and development. | |
.is_published | Boolean | Whether the product file is currently published. | |
.published | ISO-8601 Datetime | When the product file was originally published by Scale. | |
.unpublished | ISO-8601 Datetime | When the product file was unpublished by Scale. | |
.job_type | JSON Object | The type of job that generated the product. (See Job Type Details) | |
.job | JSON Object | The job instance that generated the product. (See Job Details) | |
.job_exe | JSON Object | The specific job execution that generated the product. (See Job Execution Details) | |
{
"count": 55,
"next": null,
"previous": null,
"results": [
{
"id": 465,
"workspace": {
"id": 2,
"name": "Products"
},
"file_name": "my_file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 100,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "1970-01-01T00:00:00Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {...},
"countries": ["TCY", "TCT"],
"last_modified": "1970-01-01T00:00:00Z",
"is_operational": true,
"is_published": true,
"published": "1970-01-01T00:00:00Z",
"unpublished": null,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.0.0",
"title": "KML Footprint",
"description": "Creates a KML file.",
"category": "footprint",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac"
},
"job": {
"id": 47
},
"job_exe": {
"id": 49
}
},
...
]
}
|
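For example, to pull the most recent products of one job type, a client can combine the filtering and ordering parameters; a minimal sketch, assuming the requests library and a hypothetical host:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

params = {
    'job_type_name': 'kml-footprint',  # repeat the parameter to filter by multiple values
    'started': '2015-01-01T00:00:00Z',
    'order': '-created',               # newest first
    'page_size': 50,
}
products = requests.get(BASE + '/products/', params=params).json()
for product in products['results']:
    print(product['file_name'], product['file_size'], product['url'])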
Product Updates | |||
---|---|---|---|
Returns the product updates (published, unpublished, and deleted products) that have occurred in the given time range. | |||
GET /products/updates/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=file_name&order=created). Nested objects require a delimiter (ex: order=job_type__name). Prefix fields with a dash to reverse the sort, (ex: order=-created). |
job_type_id | Integer | Optional | Return only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Return only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Return only jobs with a given job type category. Duplicate it to filter by multiple values. |
is_operational | Boolean | Optional | Return only products flagged as operational status versus R&D. |
file_name | String | Optional | Return only products with a given file name. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Product Details) | |
.workspace | JSON Object | The workspace that has stored the product. (See Workspace Details) | |
.file_name | String | The name of the product file. | |
.media_type | String | The IANA media type of the product file. | |
.file_size | Integer | The size of the product file in bytes. | |
.data_type | Array | List of strings describing the data type of the product. | |
.is_deleted | Boolean | Whether the product file has been deleted. | |
.uuid | String | A unique identifier that stays stable across multiple job execution runs. | |
.url | URL | The absolute URL to use for downloading the file. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.deleted | ISO-8601 Datetime | When the product file was deleted. | |
.data_started | ISO-8601 Datetime | When collection of the underlying data file started. | |
.data_ended | ISO-8601 Datetime | When collection of the underlying data file ended. | |
.geometry | WKT String | The full geospatial geometry footprint of the product. | |
.center_point | WKT String | The central geospatial location of the product. | |
.meta_data | JSON Object | A dictionary of key/value pairs that describe product-specific attributes. | |
.countries | Array | A list of zero or more strings with the ISO3 country codes for countries contained in the geographic boundary of this file. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.is_operational | Boolean | Whether this product was produced by an operational job type or a job type still in research and development. | |
.is_published | Boolean | Whether the product file is currently published. | |
.published | ISO-8601 Datetime | When the product file was originally published by Scale. | |
.unpublished | ISO-8601 Datetime | When the product file was unpublished by Scale. | |
.job_type | JSON Object | The type of job that generated the product. (See Job Type Details) | |
.job | JSON Object | The job instance that generated the product. (See Job Details) | |
.job_exe | JSON Object | The specific job execution that generated the product. (See Job Execution Details) | |
.update | JSON Object | Contains the details of this update. | |
..action | String | The product update that occurred. Choices: [PUBLISHED, UNPUBLISHED, DELETED]. | |
..when | ISO-8601 Datetime | When the action occurred. | |
.source_files | Array | List of source files involved in the creation of this product. (See Source File Details) | |
{
"count": 55,
"next": null,
"previous": null,
"results": [
{
"id": 465,
"workspace": {
"id": 2,
"name": "Products"
},
"file_name": "my_file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 100,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "1970-01-01T00:00:00Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {...},
"countries": ["TCY", "TCT"],
"last_modified": "1970-01-01T00:00:00Z",
"is_operational": true,
"is_published": true,
"published": "1970-01-01T00:00:00Z",
"unpublished": null,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.0.0",
"title": "KML Footprint",
"description": "Creates a KML file.",
"category": "footprint",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac"
},
"job": {
"id": 47
},
"job_exe": {
"id": 49
},
"update": {
"action": "PUBLISHED",
"when": "1970-01-01T00:00:00Z"
},
"source_files": [
{
"id": 464,
"workspace": {
"id": 2,
"name": "Raw Source"
},
"file_name": "my_file.h5",
"media_type": "image/x-hdf5-image",
"file_size": 100,
"data_type": [],
"is_deleted": false,
"uuid": "3d8e577bddb17db339eae0b3d9bcf180",
"url": "http://host.com/file/path/my_file.h5",
"created": "1970-01-01T00:00:00Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {...},
"countries": ["TCY", "TCT"],
"last_modified": "1970-01-01T00:00:00Z",
"is_parsed": true,
"parsed": "1970-01-01T00:00:00Z"
}
]
},
...
]
}
|
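Because the service is driven by a time range, an external system can stay synchronized by polling with started set to the time of its previous poll. A minimal sketch, assuming the requests library and a hypothetical host:

import requests
from datetime import datetime, timedelta

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

def poll_updates(since):
    """Yield (action, file_name, when) for each product update since the given datetime."""
    params = {'started': since.strftime('%Y-%m-%dT%H:%M:%SZ')}
    data = requests.get(BASE + '/products/updates/', params=params).json()
    for result in data['results']:
        yield result['update']['action'], result['file_name'], result['update']['when']

for action, name, when in poll_updates(datetime.utcnow() - timedelta(hours=1)):
    print(when, action, name)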
Queue Services¶
These services provide access to information about the current and historical queue state, as well as allowing a user to place jobs and recipes on the queue for processing.
Job Load | |||
---|---|---|---|
Returns statistics about the current job load organized by job type. Jobs are counted when they are in the PENDING, QUEUED, and RUNNING states. NOTE: The time range must be within a one-month period (31 days). | | |
GET /load/ | |||
Query Parameters | |||
started | ISO-8601 Datetime | Optional | The start of the time range to query; defaults to the past week. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
job_type_id | Integer | Optional | Count only jobs with a given job type identifier. Duplicate it to filter by multiple values. |
job_type_name | String | Optional | Count only jobs with a given job type name. Duplicate it to filter by multiple values. |
job_type_category | String | Optional | Count only jobs with a given job type category. Duplicate it to filter by multiple values. |
job_type_priority | Integer | Optional | Count only jobs with a given job type priority. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.time | ISO-8601 Datetime | When the counts were actually recorded. | |
.pending_count | Integer | The number of jobs in the pending state at the measured time. | |
.queued_count | Integer | The number of jobs in the queued state at the measured time. | |
.running_count | Integer | The number of jobs in the running state at the measured time. | |
{
"count": 28,
"next": null,
"previous": null,
"results": [
{
"time": "2015-10-21T00:00:00Z",
"pending_count": 1,
"queued_count": 0,
"running_count": 0
},
...
]
}
|
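Since the started parameter accepts ISO-8601 durations, a relative window such as the last 24 hours can be requested directly; a minimal sketch, assuming the requests library and a hypothetical host:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

load = requests.get(BASE + '/load/', params={'started': 'PT24H0M0S'}).json()
if load['results']:
    peak = max(load['results'],
               key=lambda r: r['pending_count'] + r['queued_count'] + r['running_count'])
    print('peak load at', peak['time'])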
Get Queue Status | ||
---|---|---|
Returns the current status of the queue by grouping the queued jobs by their types. | |
GET /queue/status/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
queue_status | List | List of job types on the queue with metadata |
.count | Integer | The number of jobs of this type on the queue |
.longest_queued | ISO-8601 Datetime | When the oldest currently queued job of this type was queued |
.job_type_name | String | The name of this job type |
.job_type_version | String | The version of this job type |
.highest_priority | Integer | The highest priority of any job of this type |
.is_job_type_paused | Boolean | Whether this job type has been paused (jobs of this type will not be scheduled) |
{
"queue_status": [
{
"count": 19,
"longest_queued": "1970-01-01T00:00:00.000Z",
"job_type_name": "My Job Type",
"job_type_version": "1.0",
"highest_priority": 1,
"is_job_type_paused": false
},
...
]
}
|
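A quick way to compute the total queue depth is to sum the per-type counts; a minimal sketch, assuming the requests library and a hypothetical host:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

status = requests.get(BASE + '/queue/status/').json()
print('jobs queued:', sum(entry['count'] for entry in status['queue_status']))
for entry in status['queue_status']:
    if entry['is_job_type_paused']:
        print('paused job type still has queued jobs:', entry['job_type_name'])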
Queue New Job | ||
---|---|---|
Creates a new job and places it onto the queue. | |
POST /queue/new-job/ | ||
Content Type | application/json | |
JSON Fields | ||
job_type_id | Integer | The ID of the job type for the new job |
job_data | JSON Object | JSON defining the data to run the job on, see Job Data |
{
"job_type_id": 1234,
"job_data": {
"version": "1.0",
"input_data": [
{
"name": "Param 1",
"value": "HELLO"
},
{
"name": "Param 2",
"file_id": 9876
}
],
"output_data": [
{
"name": "Param 3",
"workspace_id": 15
}
]
}
}
|
||
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the details for the newly queued job execution | |
Content Type | application/json | |
JSON Fields | ||
JSON Object | All fields are the same as the job details model. The status will always be QUEUED and a new job_exe will be included. (See Job Details) | |
{
"id": 15096,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.0.0",
"title": "KML Footprint",
"description": "Creates a KML representation of the data",
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac",
"uses_docker": false,
"docker_privileged": false,
"docker_image": null,
"priority": 2,
"timeout": 600,
"max_tries": 1,
"cpus_required": 0.5,
"mem_required": 128.0,
"disk_out_const_required": 0.0,
"disk_out_mult_required": 0.0,
"created": "2015-06-01T00:00:00Z",
"archived": null,
"paused": null,
"last_modified": "2015-06-01T00:00:00Z"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 8
},
"revision_num": 1,
"interface": {
"input_data": [
{
"type": "file",
"name": "input_file"
}
],
"output_data": [
{
"media_type": "application/vnd.google-earth.kml+xml",
"type": "file",
"name": "output_file"
}
],
"version": "1.0",
"command": "/usr/local/bin/python2.7 /app/parser/manage.py create_footprint_kml",
"command_arguments": "${input_file} ${job_output_dir}"
},
"created": "2015-11-06T00:00:00Z"
},
"event": {
"id": 10278,
"type": "PARSE",
"rule": {
"id": 8,
"type": "PARSE",
"is_active": true,
"created": "2015-08-28T18:31:29.282Z",
"archived": null,
"last_modified": "2015-08-28T18:31:29.282Z"
},
"occurred": "2015-09-01T17:27:31.467Z"
},
"error": null,
"status": "COMPLETED",
"priority": 210,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 15360.0,
"disk_in_required": 2.0,
"disk_out_required": 16.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z",
"data": {
"input_data": [
{
"name": "input_file",
"file_id": 8480
}
],
"version": "1.0",
"output_data": [
{
"name": "output_file",
"workspace_id": 2
}
]
},
"results": {
"output_data": [
{
"name": "output_file",
"file_id": 8484
}
],
"version": "1.0"
},
"input_files": [
{
"id": 2,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "input_file.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/input_file.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:50:49Z",
"data_ended": "2015-09-10T14:51:05Z",
"geometry": null,
"center_point": null,
"meta_data": {...}
"last_modified": "2015-09-10T15:25:02.808Z"
}
],
"recipes": [
{
"id": 4832,
"recipe_type": {
"id": 6,
"name": "Recipe",
"version": "1.0.0",
"description": "Recipe description"
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 2
},
"occurred": "2015-08-28T17:58:45.280Z"
},
"created": "2015-09-01T20:32:20.912Z",
"completed": "2015-09-01T20:35:20.912Z",
"last_modified": "2015-09-01T20:35:20.912Z"
}
],
"job_exes": [
{
"id": 14552,
"status": "COMPLETED",
"command_arguments": "${input_file} ${job_output_dir}",
"timeout": 1800,
"pre_started": "2015-09-01T17:27:32.435Z",
"pre_completed": "2015-09-01T17:27:34.346Z",
"pre_exit_code": null,
"job_started": "2015-09-01T17:27:42.437Z",
"job_completed": "2015-09-01T17:27:46.762Z",
"job_exit_code": null,
"post_started": "2015-09-01T17:27:47.246Z",
"post_completed": "2015-09-01T17:27:49.461Z",
"post_exit_code": null,
"created": "2015-09-01T17:27:31.753Z",
"queued": "2015-09-01T17:27:31.716Z",
"started": "2015-09-01T17:27:32.022Z",
"ended": "2015-09-01T17:27:49.461Z",
"last_modified": "2015-09-01T17:27:49.606Z",
"job": {
"id": 15586
},
"node": {
"id": 1
},
"error": null
}
],
"products": [
{
"id": 8484,
"workspace": {
"id": 2,
"name": "Products"
},
"file_name": "file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "2015-09-01T17:27:48.477Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {},
"last_modified": "2015-09-01T17:27:49.639Z",
"is_operational": true,
"is_published": true,
"published": "2015-09-01T17:27:49.461Z",
"unpublished": null,
"job_type": {
"id": 8
},
"job": {
"id": 35
},
"job_exe": {
"id": 19
}
}
]
}
|
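Queuing a job from a script is a single POST; a minimal sketch, assuming the requests library, a hypothetical host, and placeholder identifiers:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

job = {
    'job_type_id': 1234,  # placeholder job type identifier
    'job_data': {
        'version': '1.0',
        'input_data': [{'name': 'input_file', 'file_id': 9876}],
        'output_data': [{'name': 'output_file', 'workspace_id': 15}],
    },
}
resp = requests.post(BASE + '/queue/new-job/', json=job)
resp.raise_for_status()  # expect 201 CREATED
print('job details at', resp.headers['Location'])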
Queue New Recipe | ||
---|---|---|
Creates a new recipe and places it onto the queue. | |
POST /queue/new-recipe/ | ||
Content Type | application/json | |
JSON Fields | ||
recipe_type_id | Integer | The ID of the recipe type to queue |
recipe_data | JSON Object | Defines the data to run the recipe, see Recipe Data |
{
"recipe_type_id": 1234,
"recipe_data": {
"version": "1.0",
"input_data": [
{
"name": "image",
"file_id": 1234
},
{
"name": "georeference_data",
"file_id": 1235
}
],
"workspace_id": 12
}
}
|
||
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the details for the newly queued recipe | |
Content Type | application/json | |
JSON Fields | ||
JSON Object | All fields are the same as the recipe details model. (See Recipe Details) | |
{
"id": 72,
"recipe_type": {
"id": 1,
"name": "MyRecipe",
"version": "1.0.0",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "kml",
"job_type": {
"name": "kml-footprint",
"version": "1.2.3"
}
}
]
},
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 8,
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "image/png",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "products"
}
}
},
"occurred": "2015-08-28T19:03:59.054Z",
"description": {
"file_name": "data-file.png",
"version": "1.0",
"parse_id": 1
}
},
"created": "2015-06-15T19:03:26.346Z",
"completed": "2015-06-15T19:05:26.346Z",
"last_modified": "2015-06-15T19:05:26.346Z"
"data": {
"input_data": [
{
"name": "input_file",
"file_id": 4,
}
],
"version": "1.0"
"workspace_id": 2
},
"input_files": [
{
"id": 4,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "input_file.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/input_file.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:50:49Z",
"data_ended": "2015-09-10T14:51:05Z",
"geometry": null,
"center_point": null,
"meta_data": {...}
"last_modified": "2015-09-10T15:25:02.808Z"
}
],
"jobs": [
{
"job_name": "kml",
"job": {
"id": 7,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.2.3",
"title": "KML Footprint",
"description": "Creates a KML footprint",
"category": "footprint",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 8
},
"revision_num": 1
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 8
},
"occurred": "2015-08-28T19:03:59.054Z"
},
"error": null,
"status": "COMPLETED",
"priority": 210,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 15360.0,
"disk_in_required": 2.0,
"disk_out_required": 16.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z"
}
},
...
]
}
|
Requeue Jobs | |||
---|---|---|---|
Increases the maximum failure allowance for existing jobs and puts them back on the queue. | |||
POST /queue/requeue-jobs/ | |||
Content Type | application/json | ||
JSON Fields | |||
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
status | String | Optional | Queue only jobs with a status matching these strings. Choices: [CANCELED, FAILED]. |
job_ids | Array[Integer] | Optional | Queue only jobs with a given identifier. |
job_type_ids | Array[Integer] | Optional | Queue only jobs with a given job type identifier. |
job_type_names | Array[String] | Optional | Queue only jobs with a given job type name. |
job_type_categories | Array[String] | Optional | Queue only jobs with a given job type category. |
priority | Integer | Optional | Change the priority of matching jobs when adding them to the queue. Defaults to the job's current priority; a lower number means higher priority. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
JSON Object | All fields are the same as the jobs model. The status will be PENDING or BLOCKED if the job has never been queued. The status will be QUEUED if the job has been previously queued. (See Job List) | ||
{
"count": 68,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"job_type": {
"id": 1,
"name": "scale-ingest",
"version": "1.0",
"title": "Scale Ingest",
"description": "Ingests a source file into a workspace",
"is_system": true,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f013"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 1
},
"revision_num": 1
},
"event": {
"id": 3,
"type": "STRIKE_TRANSFER",
"rule": null,
"occurred": "2015-08-28T17:57:24.261Z"
},
"error": null,
"status": "QUEUED",
"priority": 10,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_in_required": 0.0,
"disk_out_required": 64.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z"
},
...
]
}
|
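For example, recent failures of a single job type can be requeued at a higher priority (a lower number); a minimal sketch, assuming the requests library, a hypothetical host, and a placeholder job type name:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

body = {
    'status': 'FAILED',
    'job_type_names': ['kml-footprint'],  # placeholder job type name
    'started': '2015-08-28T00:00:00Z',
    'priority': 10,
}
resp = requests.post(BASE + '/queue/requeue-jobs/', json=body)
resp.raise_for_status()
print('requeued', resp.json()['count'], 'jobs')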
Recipe Services¶
These services provide access to information about recipes.
Recipe List | |||
---|---|---|---|
Returns a list of all recipes. | |||
GET /recipes/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). |
type_id | Integer | Optional | Return only recipes with a given recipe type identifier. Duplicate it to filter by multiple values. |
type_name | String | Optional | Return only recipes with a given recipe type name. Duplicate it to filter by multiple values. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Recipe Details) | |
.recipe_type | JSON Object | The recipe type that is associated with the recipe. This represents the latest version of the definition. (See Recipe Type Details) | |
.recipe_type_rev | JSON Object | The recipe type revision that is associated with the recipe. This represents the definition at the time the recipe was scheduled. (See Recipe Type Revision Details) | |
.event | JSON Object | The trigger event that is associated with the recipe. (See Trigger Event Details) | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.completed | ISO-8601 Datetime | When every job in the recipe was completed successfully. This field will remain null if a job in the recipe is blocked or failed. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
{
"count": 15,
"next": null,
"previous": null,
"results": [
{
"id": 72,
"recipe_type": {
"id": 1,
"name": "my-recipe",
"version": "1.0.0",
"description": "Does some stuff"
},
"recipe_type_rev": {
"id": 6,
"recipe_type": {
"id": 1
},
"revision_num": 3
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 8,
},
"occurred": "2015-06-15T19:03:26.346Z"
},
"created": "2015-06-15T19:03:26.346Z",
"completed": "2015-06-15T19:05:26.346Z",
"last_modified": "2015-06-15T19:05:26.346Z"
},
...
]
}
|
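All of the list services share the same count/next/previous envelope, so a client can walk every page by following next until it is null; a minimal sketch, assuming the requests library and a hypothetical host:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

def iter_results(url, **params):
    """Generate result objects across every page of a Scale list service."""
    while url:
        data = requests.get(url, params=params).json()
        for result in data['results']:
            yield result
        url, params = data['next'], {}  # the next URL already embeds the query string

for recipe in iter_results(BASE + '/recipes/', type_name='my-recipe'):
    print(recipe['id'], recipe['recipe_type']['name'])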
Recipe Details | ||
---|---|---|
Returns a specific recipe and all its related model information including definition, event, data, and jobs. | ||
GET /recipes/{id}/ | |
||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
recipe_type | JSON Object | The recipe type that is associated with the recipe. (See Recipe Type Details) |
recipe_type_rev | JSON Object | The recipe type revision that is associated with the recipe. This represents the definition at the time the recipe was scheduled. (See Recipe Type Revision Details) |
event | JSON Object | The trigger event that is associated with the recipe. (See Trigger Event Details) |
created | ISO-8601 Datetime | When the associated database model was initially created. |
completed | ISO-8601 Datetime | When every job in the recipe was completed successfully. This field will remain null if a job in the recipe is blocked or failed. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
data | JSON Object | JSON description defining the data used to execute a recipe instance. (See Recipe Data Specification Version 1.0) |
input_files | Array | A list of files that the recipe used as input. (See Scale File Details) |
jobs | Array | The jobs associated with this recipe. |
.job_name | String | The name of the job for this recipe. |
.job | JSON Object | The job that is associated with the recipe. (See Job Details) |
{
"id": 72,
"recipe_type": {
"id": 1,
"name": "MyRecipe",
"version": "1.0.0",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "kml",
"job_type": {
"name": "kml-footprint",
"version": "1.2.3"
}
}
]
},
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null
},
"recipe_type_rev": {
"id": 5,
"recipe_type": {
"id": 1
},
"revision_num": 3,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "kml",
"job_type": {
"name": "kml-footprint",
"version": "1.2.3"
}
}
]
},
"created": "2015-11-06T19:44:09.989Z"
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 8,
"type": "PARSE",
"name": "parse-png",
"is_active": true,
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": []
}
}
},
"occurred": "2015-08-28T19:03:59.054Z",
"description": {
"file_name": "data-file.png",
"version": "1.0",
"parse_id": 1
}
},
"created": "2015-06-15T19:03:26.346Z",
"completed": "2015-06-15T19:05:26.346Z",
"last_modified": "2015-06-15T19:05:26.346Z"
"data": {
"input_data": [
{
"name": "input_file",
"file_id": 4,
}
],
"version": "1.0"
"workspace_id": 2
},
"input_files": [
{
"id": 4,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "input_file.txt",
"media_type": "text/plain",
"file_size": 1234,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/input_file.txt",
"created": "2015-09-10T15:24:53.962Z",
"deleted": null,
"data_started": "2015-09-10T14:50:49Z",
"data_ended": "2015-09-10T14:51:05Z",
"geometry": null,
"center_point": null,
"meta_data": {...}
"last_modified": "2015-09-10T15:25:02.808Z"
}
],
"jobs": [
{
"job_name": "kml",
"job": {
"id": 7,
"job_type": {
"id": 8,
"name": "kml-footprint",
"version": "1.2.3",
"title": "KML Footprint",
"description": "Creates a KML footprint",
"category": "footprint",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f0ac"
},
"job_type_rev": {
"id": 5,
"job_type": {
"id": 8
},
"revision_num": 1,
"interface": {...},
"created": "2015-11-06T21:30:34.622Z"
},
"event": {
"id": 7,
"type": "PARSE",
"rule": {
"id": 8
},
"occurred": "2015-08-28T19:03:59.054Z"
},
"error": null,
"status": "COMPLETED",
"priority": 210,
"num_exes": 1,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 15360.0,
"disk_in_required": 2.0,
"disk_out_required": 16.0,
"created": "2015-08-28T17:55:41.005Z",
"queued": "2015-08-28T17:56:41.005Z",
"started": "2015-08-28T17:57:41.005Z",
"ended": "2015-08-28T17:58:41.005Z",
"last_status_change": "2015-08-28T17:58:45.906Z",
"last_modified": "2015-08-28T17:58:46.001Z"
}
},
...
]
}
|
Recipe Types Services¶
These services provide access to information about recipe types.
Recipe Type List | |||
---|---|---|---|
Returns recipe types and basic recipe type information. | | |
GET /recipe-types/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=version). Prefix fields with a dash to reverse the sort, (ex: order=-name). |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Recipe Type Details) | |
.name | String | The stable name of the recipe type, used for queries. |
.version | String | The version of the recipe type. | |
.title | String | The human readable display name of the recipe type. | |
.description | String | An optional description of the recipe type. | |
.is_active | Boolean | Whether the recipe type is active (false once recipe type is archived). | |
.definition | JSON Object | JSON description defining the interface for running a recipe of this type. (See Recipe Definition Specification Version 1.0) | |
.revision_num | Integer | The current revision number of the recipe type, incremented for each edit. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.archived | ISO-8601 Datetime | When the recipe type was archived (no longer active). | |
.trigger_rule | JSON Object | The linked trigger rule that automatically invokes the recipe type. (See Trigger Rule Details) | |
{
"count": 9,
"next": null,
"previous": null,
"results": [
{
"id": 1,
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "nitf",
"job_type": {
"name": "nitf-converter",
"version": "1.2.3"
}
}
]
},
"revision_num": 1,
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null,
"trigger_rule": {
"id": 12
}
},
...
]
}
|
Create Recipe Type | |||
---|---|---|---|
Creates a new recipe type with an associated definition. | | |
POST /recipe-types/ | |||
Content Type | application/json | ||
JSON Fields | |||
name | String | Required | The stable name of recipe type used for queries. |
version | String | Required | The version of the recipe type. |
title | String | Optional | The human-readable name of the recipe type. |
description | String | Optional | An optional description of the recipe type. |
definition | JSON Object | Required | JSON description of the interface for running a recipe of this type. (See Recipe Definition Specification Version 1.0) |
trigger_rule | JSON Object | Optional | The linked trigger rule that automatically invokes the recipe type. The type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause the recipe. (See Trigger Rule Details) |
{
"name": "my-recipe",
"version": "1.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"definition": {
"input_data": [
{
"media_types": ["text/plain"],
"type": "file",
"name": "input_file"
}
],
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob1",
"job_type": {
"name": "my-job1",
"version": "1.2.3"
}
},
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob2",
"job_type": {
"name": "my-job2",
"version": "4.5.6"
}
}
]
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
|||
Successful Response | |||
Status | 201 CREATED | ||
Content Type | application/json | ||
JSON Fields | |||
JSON Object | All fields are the same as the recipe type details model. (See Recipe Type Details) | ||
{
"id": 1,
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "my_job_type",
"job_type": {
"name": "my-job-type",
"version": "1.2.3"
}
}
]
},
"revision_num": 1,
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null,
"trigger_rule": {
"id": 12,
"type": "PARSE",
"name": "my-job-type-recipe",
"is_active": true,
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": [
"My-Type"
]
}
}
},
"job_types": [
{
"id": 35,
"name": "my-job-type",
"version": "1.2.3",
"title": "Job Type",
"description": "This is a job type",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"interface": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"command": "command_to_run.sh",
"output_data": [
{
"media_type": "image/png",
"type": "file",
"name": "my_file_name"
}
],
"command_arguments": "${input_file} ${job_output_dir}"
}
},
...
]
}
|
Validate Recipe Type | ||||
---|---|---|---|---|
Validates a new recipe type without actually saving it. | | | |
POST /recipe-types/validation/ | ||||
Content Type | application/json | |||
JSON Fields | ||||
name | String | Required | The stable name of the recipe type, used for queries. | |
version | String | Required | The version of the recipe type. | |
title | String | Optional | The human-readable name of the recipe type. | |
description | String | Optional | An optional description of the recipe type. | |
definition | JSON Object | Required | JSON description defining the interface for running the recipe type. (See Recipe Definition Specification Version 1.0) | |
trigger_rule | JSON Object | Optional | The linked trigger rule that automatically invokes the recipe type. The type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause the recipe. (See Trigger Rule Details) | |
{
"name": "my-recipe",
"version": "1.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"input_data": [
{
"media_types": ["text/plain"],
"type": "file",
"name": "input_file"
}
],
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob1",
"job_type": {
"name": "my-job1",
"version": "1.2.3"
}
},
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob2",
"job_type": {
"name": "my-job2",
"version": "4.5.6"
}
}
]
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
||||
Successful Response | ||||
Status | 200 OK | |||
Content Type | application/json | |||
JSON Fields | ||||
warnings | Array | A list of warnings discovered during validation. | ||
.id | String | An identifier for the warning. | ||
.details | String | A human-readable description of the problem. | ||
{
"warnings": [
"id": "media_type",
"details": "Invalid media type for data input: input_file -> image/png"
]
}
|
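The validation service makes a natural pre-flight check before the create call; a minimal sketch, assuming the requests library, a hypothetical host, and a recipe_type dict containing the fields described above:

import requests

BASE = 'http://scale.example.com'  # hypothetical host; adjust for your deployment

def create_if_valid(recipe_type):
    """Validate a recipe type and only create it when no warnings are returned."""
    check = requests.post(BASE + '/recipe-types/validation/', json=recipe_type)
    check.raise_for_status()
    warnings = check.json()['warnings']
    for warning in warnings:
        print('warning:', warning['id'], '-', warning['details'])
    if warnings:
        return None
    return requests.post(BASE + '/recipe-types/', json=recipe_type).json()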
Recipe Type Details | ||
---|---|---|
Returns a specific recipe type and all its related model information. | ||
GET /recipe-types/{id}/ | |
||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
name | String | The stable name of the recipe type, used for queries. |
version | String | The version of the recipe type. |
title | String | The human readable display name of the recipe type. |
description | String | An optional description of the recipe type. |
is_active | Boolean | Whether the recipe type is active (false once recipe type is archived). |
definition | JSON Object | JSON description defining the interface for running a recipe of this type. (See Recipe Definition Specification Version 1.0) |
revision_num | Integer | The current revision number of the recipe type, incremented for each edit. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
archived | ISO-8601 Datetime | When the recipe type was archived (no longer active). |
trigger_rule | JSON Object | The associated trigger rule that automatically invokes this recipe type. (See Trigger Rule Details) |
job_types | Array | List of all job_types that are referenced by this recipe type’s definition (See Job Type Details) |
{
"id": 1,
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "my_job_type",
"job_type": {
"name": "my-job-type",
"version": "1.2.3"
}
}
]
},
"revision_num": 1,
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null,
"trigger_rule": {
"id": 12,
"type": "PARSE",
"name": "my-job-type-recipe",
"is_active": true,
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": [
"My-Type"
]
}
}
},
"job_types": [
{
"id": 35,
"name": "my-job-type",
"version": "1.2.3",
"title": "Job Type",
"description": "This is a job type",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"interface": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"command": "command_to_run.sh",
"output_data": [
{
"media_type": "image/png",
"type": "file",
"name": "my_file_name"
}
],
"command_arguments": "${input_file} ${job_output_dir}"
}
},
...
]
}
|
Edit Recipe Type | |||
---|---|---|---|
Edits an existing recipe type with an associated definition. | | |
PATCH /recipe-types/{id}/ | | |
|||
Content Type | application/json | ||
JSON Fields | |||
title | String | Optional | The human-readable name of the recipe type. |
description | String | Optional | An optional description of the recipe type. |
definition | JSON Object | Optional | JSON description of the interface for running a recipe of this type. (See Recipe Definition Specification Version 1.0) |
trigger_rule | JSON Object | Optional | The linked trigger rule that automatically invokes the recipe type. The type and configuration fields are required if setting a rule. The is_active field is optional and can be used to pause the recipe. Set this field to null to remove the existing trigger rule. (See Trigger Rule Details) |
{
"title": "My Recipe",
"description": "This is a description of the recipe",
"definition": {
"input_data": [
{
"media_types": ["text/plain"],
"type": "file",
"name": "input_file"
}
],
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob1",
"job_type": {
"name": "my-job1",
"version": "1.2.3"
}
},
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "MyJob2",
"job_type": {
"name": "my-job2",
"version": "4.5.6"
}
}
]
},
"trigger_rule": {
"type": "PARSE",
"is_active": true,
"configuration": {
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": []
},
"data": {
"input_data_name": "input_file",
"workspace_name": "rs"
}
}
}
}
|
|||
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
JSON Object | All fields are the same as the recipe type details model. (See Recipe Type Details) | ||
{
"id": 1,
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "This is a description of the recipe",
"is_active": true,
"definition": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"jobs": [
{
"recipe_inputs": [
{
"job_input": "input_file",
"recipe_input": "input_file"
}
],
"name": "my_job_type",
"job_type": {
"name": "my-job-type",
"version": "1.2.3"
}
}
]
},
"revision_num": 2,
"created": "2015-06-15T19:03:26.346Z",
"last_modified": "2015-06-15T19:03:26.346Z",
"archived": null,
"trigger_rule": {
"id": 12,
"type": "PARSE",
"name": "my-job-type-recipe",
"is_active": true,
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": [
"My-Type"
]
}
}
},
"job_types": [
{
"id": 35,
"name": "my-job-type",
"version": "1.2.3",
"title": "Job Type",
"description": "This is a job type",
"category": "system",
"author_name": null,
"author_url": null,
"is_system": false,
"is_long_running": false,
"is_active": true,
"is_operational": true,
"is_paused": false,
"icon_code": "f1c5",
"interface": {
"input_data": [
{
"media_types": [
"image/png"
],
"type": "file",
"name": "input_file"
}
],
"version": "1.0",
"command": "command_to_run.sh",
"output_data": [
{
"media_type": "image/png",
"type": "file",
"name": "my_file_name"
}
],
"command_arguments": "${input_file} ${job_output_dir}"
}
},
...
]
}
|
Scale File Services¶
These services provide access to information about general files that are being tracked by Scale.
Scheduler Services¶
These services provide access to information about the scheduler. There is exactly one scheduler entry in the database; it stores global data such as the paused status.
Get Scheduler | ||
---|---|---|
Returns data for the scheduler. | |
GET /scheduler/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
is_paused | Boolean | True if the scheduler is paused. This works like pausing every node at once, but each node's individual pause state is preserved, so unpausing the scheduler restores the previous per-node pause states. |
{
"is_paused": False,
}
|
||
Error Responses | ||
Status | 500 INTERNAL SERVER ERROR | |
Content Type | text/plain | |
An internal error occurred; this often indicates a missing database entry. |
Update Scheduler | ||
---|---|---|
Update one or more fields for the scheduler. | ||
|
||
Content Type | application/json | |
JSON Fields | ||
is_paused | Boolean | (Optional) True if the scheduler should be paused, false to resume. |
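For example, a request body that pauses the scheduler might look like the following minimal sketch, using only the field above:
{
    "is_paused": true
}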
Successful Response | ||
Status | 201 CREATED | |
Location | URL pointing to the scheduler information (should be the same as the request URL) | |
Content Type | application/json | |
Response format is identical to GET but contains the updated data. | ||
JSON Fields | ||
is_paused | Boolean | True if the scheduler is paused. This functions like pausing every node individually, but the individual node pause states are maintained separately, so un-pausing the scheduler restores each node’s previous pause state. |
Error Responses | ||
Status | 400 BAD REQUEST | |
Content Type | text/plain | |
Unexpected fields were specified (an error message lists them), or no fields were specified at all. | |
Status | 500 INTERNAL SERVER ERROR | |
Content Type | text/plain | |
An internal error occurred; this often indicates a missing database entry. |
Get System Status | ||
---|---|---|
Returns overall master, scheduler, and cluster information, including hardware resources. | ||
GET /status/ | ||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
master | JSON Object | Overall status information for the master host |
master.hostname | String | The network name of the master host |
master.port | Integer | The network port of the master host |
master.is_online | Boolean | Indicates whether or not the master host is running and available |
scheduler | JSON Object | Overall status information for the scheduler framework |
scheduler.hostname | String | The network name of the scheduler host |
scheduler.is_online | Boolean | Indicates whether or not the scheduler host is running and available |
scheduler.is_paused | Boolean | Indicates whether or not the scheduler framework is currently paused |
queue_depth | Integer | The number of tasks currently scheduled on the queue |
resources | JSON Object | (Optional) Information about the overall hardware resources of the cluster NOTE: Resource information may not always be available |
resources.total | JSON Object | The total hardware resources for all nodes in the cluster |
resources.total.cpus | Float | The total number of CPUs for all nodes in the cluster |
resources.total.mem | Float | The total amount of RAM in MiB for all nodes in the cluster |
resources.total.disk | Float | The total amount of disk space in MiB for all nodes in the cluster |
resources.scheduled | JSON Object | The scheduled hardware resources for all nodes in the cluster |
resources.scheduled.cpus | Float | The scheduled number of CPUs for all nodes in the cluster |
resources.scheduled.mem | Float | The scheduled amount of RAM in MiB for all nodes in the cluster |
resources.scheduled.disk | Float | The scheduled amount of disk space in MiB for all nodes in the cluster |
resources.used | JSON Object | The used hardware resources for all nodes in the cluster NOTE: Real-time resource usage is not currently available and will be all zero |
resources.used.cpus | Float | The used number of CPUs for all nodes in the cluster |
resources.used.mem | Float | The used amount of RAM in MiB for all nodes in the cluster |
resources.used.disk | Float | The used amount of disk space in MiB for all nodes in the cluster |
{
"master": {
"is_online": true,
"hostname": "localhost",
"port": 5050
},
"scheduler": {
"is_online": true,
"is_paused": false,
"hostname": "localhost"
},
"queue_depth": 1234,
"resources": {
"total": {
"cpus": 16.0,
"mem": 63305.0,
"disk": 131485.0
},
"scheduled": {
"cpus": 12.0,
"mem": 35392.0,
"disk": 131408.0
},
"used": {
"cpus": 16.0,
"mem": 63305.0,
"disk": 131485.0
}
}
}
|
Source File Services¶
These services provide access to information about source files that Scale has ingested.
Source File List | |||
---|---|---|---|
Returns a list of all source files | |||
GET /sources/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=file_name&order=created). Nested objects require a delimiter (ex: order=job_type__name). Prefix fields with a dash to reverse the sort, (ex: order=-created). |
is_parsed | Boolean | Optional | Return only sources flagged as successfully parsed. |
file_name | String | Optional | Return only sources with a given file name. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Source File Details) | |
.workspace | JSON Object | The workspace that has stored the source file. (See Workspace Details) | |
.file_name | String | The name of the source file. | |
.media_type | String | The IANA media type of the source file. | |
.file_size | Integer | The size of the source file in bytes. | |
.data_type | Array | List of strings describing the data type of the source. | |
.is_deleted | Boolean | Whether the source file has been deleted. | |
.uuid | String | A unique identifier that stays stable across multiple job execution runs. | |
.url | URL | The absolute URL to use for downloading the file. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.deleted | ISO-8601 Datetime | When the source file was deleted. | |
.data_started | ISO-8601 Datetime | When collection of the underlying data file started. | |
.data_ended | ISO-8601 Datetime | When collection of the underlying data file ended. | |
.geometry | WKT String | The full geospatial geometry footprint of the source. | |
.center_point | WKT String | The central geospatial location of the source. | |
.meta_data | JSON Object | A dictionary of key/value pairs that describe source-specific attributes. | |
.countries | Array | A list of zero or more strings with the ISO3 country codes for countries contained in the geographic boundary of this file. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.is_parsed | Boolean | Whether this source was successfully parsed. | |
.parsed | ISO-8601 Datetime | When the source file was originally parsed by Scale. | |
{
"count": 55,
"next": null,
"previous": null,
"results": [
{
"id": 465,
"workspace": {
"id": 1,
"name": "Raw Source"
},
"file_name": "my_file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 100,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "1970-01-01T00:00:00Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {...},
"countries": ["TCY", "TCT"],
"last_modified": "1970-01-01T00:00:00Z",
"is_parsed": true,
"parsed": "1970-01-01T00:00:00Z"
},
...
]
}
|
Source File Updates | |||
---|---|---|---|
Returns the source file updates (created, parsed, and deleted sources) that have occurred in the given time range. | |||
GET /sources/updates/ | |||
Query Parameters | |||
page | Integer | Optional | The page of the results to return. Defaults to 1. |
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. |
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). |
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=file_name&order=created). Nested objects require a delimiter (ex: order=job_type__name). Prefix fields with a dash to reverse the sort, (ex: order=-created). |
is_parsed | Boolean | Optional | Return only sources flagged as successfully parsed. |
file_name | String | Optional | Return only sources with a given file name. |
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
count | Integer | The total number of results that match the query parameters. | |
next | URL | A URL to the next page of results. | |
previous | URL | A URL to the previous page of results. | |
results | Array | List of result JSON objects that match the query parameters. | |
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Source File Details) | |
.workspace | JSON Object | The workspace that has stored the source file. (See Workspace Details) | |
.file_name | String | The name of the source file. | |
.media_type | String | The IANA media type of the source file. | |
.file_size | Integer | The size of the source file in bytes. | |
.data_type | Array | List of strings describing the data type of the source. | |
.is_deleted | Boolean | Whether the source file has been deleted. | |
.uuid | String | A unique identifier that stays stable across multiple job execution runs. | |
.url | URL | The absolute URL to use for downloading the file. | |
.created | ISO-8601 Datetime | When the associated database model was initially created. | |
.deleted | ISO-8601 Datetime | When the source file was deleted. | |
.data_started | ISO-8601 Datetime | When collection of the underlying data file started. | |
.data_ended | ISO-8601 Datetime | When collection of the underlying data file ended. | |
.geometry | WKT String | The full geospatial geometry footprint of the source. | |
.center_point | WKT String | The central geospatial location of the source. | |
.meta_data | JSON Object | A dictionary of key/value pairs that describe source-specific attributes. | |
.countries | Array | A list of zero or more strings with the ISO3 country codes for countries contained in the geographic boundary of this file. | |
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |
.is_parsed | Boolean | Whether this source was successfully parsed. | |
.parsed | ISO-8601 Datetime | When the source file was originally parsed by Scale. | |
.update | JSON Object | Contains the details of this update. | |
..action | String | The source file update that occurred. Choices: [CREATED, PARSED, DELETED]. | |
..when | ISO-8601 Datetime | When the action occurred. | |
{
"count": 55,
"next": null,
"previous": null,
"results": [
{
"id": 465,
"workspace": {
"id": 2,
"name": "Raw Source"
},
"file_name": "my_file.kml",
"media_type": "application/vnd.google-earth.kml+xml",
"file_size": 100,
"data_type": [],
"is_deleted": false,
"uuid": "c8928d9183fc99122948e7840ec9a0fd",
"url": "http://host.com/file/path/my_file.kml",
"created": "1970-01-01T00:00:00Z",
"deleted": null,
"data_started": null,
"data_ended": null,
"geometry": null,
"center_point": null,
"meta_data": {...},
"countries": ["TCY", "TCT"],
"last_modified": "1970-01-01T00:00:00Z",
"is_parsed": true,
"parsed": "1970-01-01T00:00:00Z",
"update": {
"action": "PUBLISHED",
"when": "1970-01-01T00:00:00Z"
}
},
...
]
}
|
Strike Services¶
These services allow a user to create, view, and manage Strike processes.
Create Strike Process | |||
---|---|---|---|
Creates a new Strike process and places it onto the queue | |||
POST /strike/create/ | |||
Content Type | application/json | ||
JSON Fields | |||
name | String | Required | The unique name of the Strike process |
title | String | Optional | A display title for the Strike process |
description | String | Optional | A description for the Strike process |
configuration | JSON Object | Required | JSON defining the Strike configuration, see Strike Configuration Specification Version 1.0 |
{
"name": "my-strike-process",
"title": "My Strike Process",
"description": "This is my Strike process for detecting my favorite files!",
"configuration": {
"version": "1.0",
"mount": "host:/my/path",
"transfer_suffix": "_tmp",
"files_to_ingest": [{
"filename_regex": ".*txt",
"workspace_path": "/my/path",
"workspace_name": "rs"
}]
}
}
|
|||
Successful Response | |||
Status | 200 OK | ||
Content Type | application/json | ||
JSON Fields | |||
strike_id | Integer | The ID of the new Strike process | |
{
"strike_id": 5678
}
|
Trigger Services¶
These services provide access to information about ingest/processing triggers.
TODO
Workspace Services¶
These services provide access to information about workspaces that Scale uses to manage files.
Workspace List | |||||
---|---|---|---|---|---|
Returns a list of all workspaces. | |||||
GET /workspaces/ | |||||
Query Parameters | |||||
page | Integer | Optional | The page of the results to return. Defaults to 1. | ||
page_size | Integer | Optional | The size of the page to use for pagination of results. Defaults to 100, and can be anywhere from 1-1000. | ||
started | ISO-8601 Datetime | Optional | The start of the time range to query. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). | ||
ended | ISO-8601 Datetime | Optional | End of the time range to query, defaults to the current time. Supports the ISO-8601 date/time format, (ex: 2015-01-01T00:00:00Z). Supports the ISO-8601 duration format, (ex: PT3H0M0S). | ||
name | String | Optional | Return only workspaces with a given name. Duplicate it to filter by multiple values. | ||
order | String | Optional | One or more fields to use when ordering the results. Duplicate it to multi-sort, (ex: order=name&order=title). Prefix fields with a dash to reverse the sort, (ex: order=-name). | ||
Successful Response | |||||
Status | 200 OK | ||||
Content Type | application/json | ||||
JSON Fields | |||||
count | Integer | The total number of results that match the query parameters. | |||
next | URL | A URL to the next page of results. | |||
previous | URL | A URL to the previous page of results. | |||
results | Array | List of result JSON objects that match the query parameters. | |||
.id | Integer | The unique identifier of the model. Can be passed to the details API call. (See Workspace Details) | |||
.name | String | The stable name of the workspace used for queries. | |||
.title | String | The human readable display name of the workspace. | |||
.description | String | A longer description of the workspace. | |||
.base_url | String | The URL prefix used to access all files within the workspace. This field can be null if the workspace is not web-accessible. | |||
.is_active | Boolean | Whether the workspace is active (false once workspace is archived). | |||
.used_size | Decimal | The amount of disk space currently being used by the workspace in bytes. This field can be null if the disk space is unknown. | |||
.total_size | Decimal | The total amount of disk space provided by the workspace in bytes. This field can be null if the disk space is unknown. | |||
.created | ISO-8601 Datetime | When the associated database model was initially created. | |||
.archived | ISO-8601 Datetime | When the workspace was archived (no longer active). | |||
.last_modified | ISO-8601 Datetime | When the associated database model was last saved. | |||
{
"count": 5,
"next": null,
"previous": null,
"results": [
{
"id": 2,
"name": "products",
"title": "Products",
"description": "Products Workspace",
"base_url": "http://host.com/products",
"is_active": true,
"used_size": 0,
"total_size": 0,
"created": "2015-10-05T21:26:04.876Z",
"archived": null,
"last_modified": "2015-10-05T21:26:04.876Z"
},
{
"id": 1,
"name": "rs",
"title": "Raw Source",
"description": "Raw Source Workspace",
"base_url": "http://host.com/rs",
"is_active": true,
"used_size": 0,
"total_size": 0,
"created": "2015-10-05T21:26:04.855Z",
"archived": null,
"last_modified": "2015-10-05T21:26:04.855Z"
},
...
]
}
|
Workspace Details | ||
---|---|---|
Returns workspace details | ||
|
||
Successful Response | ||
Status | 200 OK | |
Content Type | application/json | |
JSON Fields | ||
id | Integer | The unique identifier of the model. |
name | String | The stable name of the workspace used for queries. |
title | String | The human readable display name of the workspace. |
description | String | A longer description of the workspace. |
base_url | String | The URL prefix used to access all files within the workspace. This field can be null if the workspace is not web-accessible. |
is_active | Boolean | Whether the workspace is active (false once workspace is archived). |
used_size | Decimal | The amount of disk space currently being used by the workspace in bytes. This field can be null if the disk space is unknown. |
total_size | Decimal | The total amount of disk space provided by the workspace in bytes. This field can be null if the disk space is unknown. |
created | ISO-8601 Datetime | When the associated database model was initially created. |
archived | ISO-8601 Datetime | When the workspace was archived (no longer active). |
last_modified | ISO-8601 Datetime | When the associated database model was last saved. |
json_config | JSON Object | JSON configuration with attributes specific to the type of workspace. (See Workspaces) |
{
"id": 1,
"name": "rs",
"title": "Raw Source",
"description": "Raw Source Workspace",
"base_url": "http://host.com/rs",
"is_active": true,
"used_size": 0,
"total_size": 0,
"created": "2015-10-05T21:26:04.855Z",
"archived": null,
"last_modified": "2015-10-05T21:26:04.855Z"
"json_config": {...}
}
|
Algorithms¶
This document explains the steps necessary to integrate an algorithm into the Scale system. The Scale system utilizes Docker containers to run algorithms in an isolated environment. The first step is to build a Docker image that encapsulates the algorithm.
To build the Docker image, Docker must be installed on the system and the Docker daemon must be running. Depending on the Linux distribution, the following packages need to be installed: docker-io and lxc for CentOS 6, or docker and lxc for CentOS 7. The Docker daemon service then needs to be started, using the service command on CentOS 6 or the systemctl command on CentOS 7 (systemctl enable docker; systemctl start docker), as sketched below.
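The commands below sketch that setup, assuming yum as the package manager (exact package availability may vary by repository):
# CentOS 6: install the packages and start the Docker daemon
sudo yum install -y docker-io lxc
sudo service docker start

# CentOS 7: install the packages, then enable and start the Docker daemon
sudo yum install -y docker lxc
sudo systemctl enable docker
sudo systemctl start docker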
Next, create a Dockerfile that defines the instructions for building the image. The following is an example of the format of a Dockerfile:
Dockerfile example
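The following is a minimal sketch of what such a Dockerfile could look like for the make_geotiff.py algorithm used elsewhere in this documentation; the base image, installed packages, and file layout are illustrative assumptions, not Scale requirements:
# Start from a base operating system image (assumption: CentOS 7).
FROM centos:7

# Install the runtime the algorithm needs (assumption: a Python script).
RUN yum install -y python

# Copy the algorithm into the image.
COPY make_geotiff.py /app/make_geotiff.py
WORKDIR /app

# No ENTRYPOINT is required; Scale supplies the command and command_arguments
# defined in the job interface (e.g. "python make_geotiff.py").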
Results Manifest¶
The results manifest is a JSON document that defines the output of an algorithm’s run. Using the results manifest, you can specify your outputs, parse information, run information, and errors. In addition, you can register artifacts by printing a line to stdout with the following format: “ARTIFACT:<output_name>:<path_to_file>”. The artifact string must be on its own line, and if there are any conflicts with the manifest file, the manifest file takes precedence.
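As an illustration, the following minimal sketch (not taken from the Scale source) shows both registration mechanisms from within a Python algorithm; it assumes the output directory (${job_output_dir}) is passed as the last command line argument and that the job interface defines an output named output_file:
import json
import os
import sys

# Assumption: ${job_output_dir} was passed as the last command line argument.
job_output_dir = sys.argv[-1]
output_path = os.path.join(job_output_dir, "output.csv")

# Option 1: register the artifact by printing an ARTIFACT line to stdout.
print("ARTIFACT:output_file:" + output_path)

# Option 2: write a results manifest, which takes precedence over ARTIFACT
# lines if the two conflict.
manifest = {
    "version": "1.1",
    "output_data": [
        {"name": "output_file", "file": {"path": output_path}}
    ]
}
with open(os.path.join(job_output_dir, "results_manifest.json"), "w") as manifest_file:
    json.dump(manifest, manifest_file)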
The following are some example output manifest files:
Results manifest with one output
{
"version": "1.1",
"output_data": [
{
"name" : "output_file",
"file": {
"path" : "/tmp/job_exe_231/outputs/output.csv"
}
}
]
}
The above manifest simply says that the output with the name “output_file” can be found on the local computer at the location “/tmp/job_exe_231/outputs/output.csv”.
Results manifest with a parsed input
{
"version": "1.1",
"parse_results": [
{
"filename" : "myfile.h5",
"data_types" : [
"H5",
"VEG"
],
"geo_metadata": {
"data_started" : "2015-05-15T10:34:12Z",
"data_ended" : "2015-05-15T10:36:12Z",
}
}
]
}
This example is the result of one of the inputs (myfile.h5) being parsed.
Results Manifest Specification Version 1.1¶
A valid results manifest is a JSON document with the following structure:
{
"version": STRING,
"output_data": [
{
"name": STRING,
"file": {
"path": STRING,
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
},
"files": [
{
"path": STRING,
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
}
]
}
],
"parse_results": [
{
"filename": STRING,
"new_workspace_path": STRING,
"data_types": [
STRING,
STRING
],
"geo_metadata": {
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"geo_json": JSON
}
}
],
"info": {}, # TODO: document when completed
"errors": {} # TODO: document when completed
}
version: JSON string
The version is an optional string value that defines the version of the results manifest specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.1. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your results manifest.
output_data: JSON array
The output_data is an optional array of output files that your algorithm produced. If not provided, it defaults to an empty list. The JSON object that represents each output_data entry has the following fields:
name: JSON string
The name is a required string that indicates which field in the job_interface this output corresponds to.
file: JSON object
The file is an optional JSON object, however either file or files must be present. The file field should be used if the “file” output_type was used in the job interface. The file object has the following fields:
path: JSON string
The path is the location of the file on the machine that ran the algorithm.
geo_metadata: JSON object
The geo_metadata contains additional geospatial information associated with the output file. It contains the following fields:
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
geo_json: JSON object
The geo_json is an optional JSON object containing the geospatial extents of the data. It is currently required that this contain a 3D geometry. In addition to storing the extents of the data, a center point is automatically calculated.
files: JSON array
The files is an optional array of JSON objects, however either file or files must be present. The files field should be used if the “files” output_type was used in the job interface. Each files object has the following fields:
path: JSON string
The path is the location of the file on the machine that ran the algorithm.
geo_metadata: JSON object
The geo_metadata contains additional geospatial information associated with the output file. It contains the following fields:
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
geo_json: JSON object
The geo_json is an optional JSON object containing the geospatial extents of the data. It is currently required that this contain a 3D geometry. In addition to storing the extents of the data, a center point is automatically calculated.
parse_results: JSON array
The parse_results is an array of JSON objects that contain information from parsing inputs to your algorithm. These results should be used to associate meta-data with input files to the algorithm. Each of the parse results corresponds to an input from the job interface of the type “file”. Additionally, the file must be a “source” file. A “source” file is something that was not produced by an algorithm; files produced by algorithms are known as “product” files. As an algorithm developer this distinction is not important, but it matters when you are tying an algorithm to Scale data. Each parse_results object has the following fields:
filename: JSON string
The filename is a required JSON string that is the name of the file that you have performed the parsing on.
new_workspace_path: JSON string
The new_workspace_path is an optional JSON string that is a new location where the file should be stored.
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
data_types: JSON array
The data_types is an optional array of JSON strings. Each of the strings is a file data type that this input file can be associated with.
gis_data_path: JSON string
The gis_data_path is an optional path to a GeoJSON file. The contents of this file will be set in the meta_data for the given input file. The geometry will also be set for the file. In addition to storing the extents of the data, a center point is automatically calculated.
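Putting these fields together, a version 1.1 manifest that attaches geospatial metadata to an output might look like the following sketch (the output name, path, and polygon coordinates, including the third altitude value required for a 3D geometry, are illustrative):
{
    "version": "1.1",
    "output_data": [
        {
            "name": "output_file",
            "file": {
                "path": "/tmp/job_exe_231/outputs/output.tif",
                "geo_metadata": {
                    "data_started": "2015-05-15T10:34:12Z",
                    "data_ended": "2015-05-15T10:36:12Z",
                    "geo_json": {
                        "type": "Polygon",
                        "coordinates": [
                            [
                                [1.0, 10.0, 2.0],
                                [1.1, 10.0, 2.0],
                                [1.1, 10.1, 2.0],
                                [1.0, 10.0, 2.0]
                            ]
                        ]
                    }
                }
            }
        }
    ]
}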
Results Manifest Specification Version 1.0¶
A valid version 1.0 results manifest is a JSON document with the following structure:
{
"version": STRING,
"files": [
{
"name": STRING,
"path": STRING
},
{
"name": STRING,
"paths": [
STRING,
STRING
]
}
],
"parse_results": [
{
"filename": STRING,
"data_started": STRING(ISO-8601),
"data_ended": STRING(ISO-8601),
"data_types": [
STRING,
STRING
],
"gis_data_path": STRING
}
],
"info": {}, # TODO: document when completed
"errors": {} # TODO: document when completed
}
version: JSON string
The version is an optional string value that defines the version of the results manifest specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your results manifest.
files: JSON array
The files is an optional array of output files that your algorithm produced. If not provided, files defaults to an empty list. The JSON object that represents each files entry has the following fields:
name: JSON string
The name is a required string that indicates which field in the job_interface this output corresponds to.
path: JSON string
The path is an optional string field, however either path or paths must be present. The path is the location of the file on the machine that ran the algorithm. The path field should be used if the “file” output_type was used in the job interface.
paths: JSON array
The paths is an optional array of JSON strings, however either path or paths must be present. Each string in the array is a path to a file that corresponds to a job_output. The paths field should be used if the “files” output_type was used in the job interface.
parse_results: JSON array
The parse_results is an array of JSON objects that contain information from parsing inputs to your algorithm. These results should be used to associate meta-data with input files to the algorithm. Each of the parse results corresponds to an input from the job interface of the type “file”. Additionally, the file must be a “source” file. A “source” file is something that was not produced by an algorithm; files produced by algorithms are known as “product” files. As an algorithm developer this distinction is not important, but it matters when you are tying an algorithm to Scale data. Each parse_results object has the following fields:
filename: JSON string
The filename is a required JSON string that is the name of the file that you have performed the parsing on.
data_started: JSON string (ISO-8601)
The data_started is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file started.
data_ended: JSON string (ISO-8601)
The data_ended is an optional JSON string that is formatted to the ISO-8601 standard. This field represents when the data from this file ended.
data_types: JSON array
The data_types is an optional array of JSON strings. Each of the strings is a file data type that this input file can be associated with.
gis_data_path: JSON string
The gis_data_path is an optional path to a GeoJSON file. The contents of this file will be set in the meta_data for the given input file. The geometry will also be set for the file. In addition to storing the extents of the data, a center point is automatically calculated.
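For comparison with the version 1.1 examples above, a minimal version 1.0 manifest with one single-file output and one multi-file output might look like the following sketch (the output names and paths are illustrative):
{
    "version": "1.0",
    "files": [
        {
            "name": "output_file",
            "path": "/tmp/job_exe_231/outputs/output.csv"
        },
        {
            "name": "output_images",
            "paths": [
                "/tmp/job_exe_231/outputs/image1.png",
                "/tmp/job_exe_231/outputs/image2.png"
            ]
        }
    ]
}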
Architecture¶
Overview¶

The Scale system is composed of several major pieces:
Docker Registry
The Docker registry contains the images for each job type that can be run.
Mesos Slaves
Each Mesos slave is a separate node (machine) that registers itself with the Mesos master in order to receive tasking to
run jobs. When a node runs a job, it pulls the appropriate image for the job’s type from the Docker registry and
performs a Docker run command to execute the job.
Mesos Master
The Mesos master keeps track of all nodes (including their available resources) in the cluster and all jobs that have
been scheduled on the nodes. When the master has available resources in the cluster, it offers these resources to the
Scale scheduler.
Scale Scheduler
When the Scale scheduler receives resource offers from the Mesos master, it queries the database to determine the next
job(s) that are available on the queue and able to run within the given resources. The scheduler passes the necessary
information back to the Mesos master so the master can schedule the jobs on the available nodes.
Scale Database
The Scale database is a PostGIS (PostgreSQL with the PostGIS extension) relational database that contains the job and
recipe status and history, meta-data for all source and product files, configuration data for all workspaces, triggers,
and ingest processes, and all historical metrics generated by Scale.
Scale Web Server
The Scale web server provides the RESTful HTTP API and web front-end to any external clients/browsers. The web server
retrieves data from both the Scale database and the Mesos RESTful HTTP API provided by the master.
Jobs and Recipes¶
Jobs represent the various algorithms or units of work that get executed in Scale. Recipes represent a graph/workflow of jobs that allow jobs to depend upon one another and for files produced by one job to be fed as input into another job.
Errors Interface¶
The error interface is a JSON document that defines the interface for translating the job’s exit codes to errors.
Consider the following example error interface, which maps one exit code (1) to the Scale error named “unknown”.
Example error interface:
{
"version": "1.0",
"exit_codes": {
"1": "unknown"
}
}
Error Interface Specification Version 1.0¶
A valid error interface is a JSON document with the following structure:
{
"version": STRING,
"exit_codes": {
STRING: STRING,
STRING: STRING
}
}
version: JSON string
The version is an optional string value that defines the version of the error interface specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept the error interface.
exit_codes: JSON object
The exit_codes is a required object that defines what exit codes are mapped to the Scale errors in the database. It is a map of strings, which are the algorithm’s exit codes, to strings that are the name values of the errors in the database.
Job Interface¶
The job interface is a JSON document that defines the interface for executing the job’s algorithm. It will describe the algorithm’s inputs and outputs, as well as the command line details for how to invoke the algorithm.
Consider the following example algorithm, called make_geotiff.py. make_geotiff.py is a Python script that takes a PNG image file and a CSV containing georeference information for the PNG. It combines the information from the two files to create a GeoTIFF file, which is an image format that contains georeference information. The job interface for the algorithm could be defined as follows:
Example job interface:
{
"version": "1.0",
"command": "python make_geotiff.py",
"command_arguments": "${image} ${georeference_data} ${job_output_dir}",
"input_data": [
{
"name": "image",
"type": "file",
"media_types": [
"image/png"
]
},
{
"name": "georeference_data",
"type": "file",
"media_types": [
"text/csv"
]
}
],
"output_data": [
{
"name": "geo_image",
"type": "file",
"media_type": "image/tiff"
}
]
}
The command value specifies that the algorithm is executed by invoking Python with the make_geotiff.py script. The command_arguments value describes the command line arguments to pass to the make_geotiff.py script. The image file input is first (this will be the absolute file system path of the file), the georeference_data file input will be next, and finally an output directory is provided for the script to write any output files. The input_data value is a list detailing the inputs to the algorithm; in this case an input called image that is a file with media type image/png and an input called georeference_data which is a CSV file. Finally the output_data value is a list of the algorithm outputs, which is a GeoTIFF file in this instance. To see all of the options for defining a job interface, please refer to the Job Interface Specification below.
Job Interface Specification Version 1.0¶
A valid job interface is a JSON document with the following structure:
{
"version": STRING,
"command": STRING,
"command_arguments": STRING,
"input_data": [
{
"name": STRING,
"type": "property",
"required": true|false
},
{
"name": STRING,
"type": "file",
"required": true|false,
"media_types": [
STRING,
STRING
]
},
{
"name": STRING,
"type": "files",
"required": true|false,
"media_types": [
STRING,
STRING
]
}
],
"output_data": [
{
"name": STRING,
"type": "file",
"required": true|false,
"media_type": STRING
},
{
"name": STRING,
"type": "files",
"required": true|false,
"media_type": STRING
}
]
}
version: JSON string
The version is an optional string value that defines the version of the job interface specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept the job interface.
Scale must recognize the version number as valid for the job interface to work. Currently, “1.0” is the only valid version.
command: JSON string
The command is a required string value that defines the main command to execute on the command line without any of the command line arguments. Unlike command_arguments, no string substitution will be performed.
command_arguments: JSON string
The command_arguments is a required string value that defines the command line arguments to be passed to the command when it is executed. Although required, command_arguments may be an empty string (i.e. “”). Scale will perform string substitution on special values denoted by the pattern ${...}. You can indicate that an input should be passed on the command line by using ${INPUT NAME}. The value that is substituted depends on the type of the input. If you need the command line argument to be passed with a flag, you can use the following pattern: ${FLAG:INPUT NAME}. There is also a special substitution value ${job_output_dir}, which will be replaced with the absolute file system path of the output directory where the algorithm may write its output files. The algorithm should produce a results manifest named “results_manifest.json”. The format for the results manifest can be found here: Results Manifest. Any output files must be registered in the results manifest.
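As an illustration of the flag pattern, the following is a hypothetical command_arguments fragment (the input names verbosity and input_file are illustrative); the ${-v:verbosity} substitution passes the -v flag along with the verbosity input’s value on the command line:
"command_arguments": "${-v:verbosity} ${input_file} ${job_output_dir}"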
input_data: JSON array
The input_data is an optional list of JSON objects that define the inputs the algorithm receives to perform its function. If not provided, input_data defaults to an empty list (no inputs). The JSON object that represents each input has the following fields:
name: JSON string
The name is a required string that defines the name of the input. The name of every input and output in the interface must be unique. This name must only be composed of less than 256 of the following characters: alphanumeric, ” ”, “_”, and “-”.
required: JSON boolean
The required field is optional and indicates if the input is required for the algorithm to run successfully. If not provided, the required field defaults to true.
type: JSON string
The type is a required string from a defined set that defines the type of the input. The input_data JSON object may have additional fields depending on its type. The valid types are:
property
A “property” input is a string that is passed to the algorithm on the command line. When the algorithm is executed, the value of each “property” input will be substituted where its input name is located within the command_arguments string. A “property” input has no additional fields.
file
A “file” input is a single file that is provided to the algorithm. When the algorithm is executed, the absolute file system path of each input file will be substituted where its input name is located within the command_arguments string. A “file” input has the following additional fields:
media_types: JSON array
A media_types field on a “file” input is an optional list of strings that designate the required media types for any file being passed in the input. Any file that does not match one of the listed media types will be prevented from being passed to the algorithm. If not provided, the media_types field defaults to an empty list and all media types are accepted for the input.
files
A “files” input is a list of one or more files that is provided to the algorithm. When the algorithm is executed, the absolute file system path of a directory containing the list of files will be substituted where its input name is located within the command_arguments string. A “files” input has the following additional fields:
media_types: JSON array
A media_types field on a “files” input is an optional list of strings that designate the required media types for any files being passed in the input. Any file that does not match one of the listed media types will be prevented from being passed to the algorithm. If not provided, the media_types field defaults to an empty list and all media types are accepted for the input.
output_data: JSON array
The output_data is an optional list of JSON objects that define the outputs the algorithm will produce as a result of its successful execution. If not provided, output_data defaults to an empty list (no outputs). The JSON object that represents each output has the following fields:
name: JSON string
The name is a required string that defines the name of the output. The name of every input and output in the interface must be unique. This name must only be composed of less than 256 of the following characters: alphanumeric, ” ”, “_”, and “-”.
required: JSON boolean
The required field is optional and indicates if the output is guaranteed to be produced by the algorithm on a successful run. If the algorithm may or may not produce an output under normal conditions, the required field should be set to false. If not provided, the required field defaults to true.
type: JSON string
The type is a required string from a defined set that defines the type of the output. The output_data JSON object may have additional fields depending on its type. The valid types are:
file
A “file” output is a single file that is produced by the algorithm. A “file” output has the following additional fields:
media_type: JSON string
A media_type field on a “file” output is an optional string defining the media type of the file produced. If not provided, the media type of the file will be determined by Scale using the file extension as guidance.
files
A “files” output is a list of one or more files that are produced by the algorithm. A “files” output has the following additional fields:
media_type: JSON string
A media_type field on a “files” output is an optional string defining the media type of each file produced. If not provided, the media type of each file will be determined by Scale using the file extension as guidance.
Job Data¶
The job data is a JSON document that defines the actual data and configuration on which a specific job will run. It will describe all of the data being passed to the job’s inputs, as well as configuration for how to handle the job’s output. The job data is required when placing a specific job on the queue for the first time.
Consider our previous example algorithm, make_geotiff.py, from Job Interface. The job data for queuing and running a make_geotiff.py job could be defined as follows:
Example job data:
{
"version": "1.0",
"input_data": [
{
"name": "image",
"file_id": 1234
},
{
"name": "georeference_data",
"file_id": 1235
}
],
"output_data": [
{
"name": "geo_image",
"workspace_id": 12
}
]
}
The input_data value is a list detailing the data to pass to each input to the job. In this case the input called image that takes a PNG image file is being passed a file from the Scale system that has the unique ID 1234, and the input called georeference_data which takes a CSV file is being passed a Scale file with the ID 1235. The output_data value is a list detailing the configuration for handling the job’s outputs, which in our example is a single GeoTIFF file. The configuration in our example defines that after the GeoTIFF file is produced by the job, it should be stored in the workspace with the unique ID 12. To see all of the options for defining job data, please refer to the Job Data Specification below.
Job Data Specification Version 1.0¶
A valid job data is a JSON document with the following structure:
{
"version": STRING,
"input_data": [
{
"name": STRING,
"value": STRING
},
{
"name": STRING,
"file_id": INTEGER
},
{
"name": STRING,
"file_ids": [
INTEGER,
INTEGER
]
}
],
"output_data": [
{
"name": STRING,
"workspace_id": INTEGER
}
]
}
version: JSON string
The version is an optional string value that defines the version of the data specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your job data.
input_data: JSON array
The input_data is a list of JSON objects that define the actual data the job receives for its inputs. If not provided, input_data defaults to an empty list (no input data). For the job data to be valid, every required input in the matching job interface must have a corresponding entry in this input_data field. The JSON object that represents each input data has the following fields:
name: JSON string
The name is a required string that gives the name of the input that the data is being provided for. It should match the name of an input in the job’s interface. The name of every input and output in the job data must be unique.
The other fields that describe the data being passed to the input are based upon the type of the input as it is defined in the job interface, see Job Interface Specification Version 1.0. The valid types from the job interface specification are:
property
A “property” input has the following additional field:
value: JSON string
The value field contains the string value that will be passed to the “property” input.
file
A “file” input has the following additional field:
file_id: JSON number
The required file_id field contains the unique ID of a file in the Scale system that will be passed to the input. The file must meet all of the criteria defined in the job interface for the input.
files
A “files” input has the following additional field:
file_ids: JSON array
The required file_ids field is a list of unique IDs of the files in the Scale system that will be passed to the input. Each file must meet all of the criteria defined in the job interface for the input. A “files” input will accept a file_id field instead of a file_ids field (the input will be passed a list containing the single file).
output_data: JSON array
The output_data is a list of JSON objects that define the details for how the job should handle its outputs. If not provided, output_data defaults to an empty list (no output data). For the job data to be valid, every output in the matching job interface must have a corresponding entry in this output_data field. The JSON object that represents each output data has the following fields:
name: JSON string
The name is a required string that gives the name of the output that the configuration is being provided for. It should match the name of an output in the job’s interface. The name of every input and output in the job data must be unique.
The other fields that describe the output configuration are based upon the type of the output as it is defined in the job interface, see Job Interface Specification Version 1.0. The valid types from the job interface specification are:
file
A “file” output has the following additional field:
workspace_id: JSON number
The required workspace_id field contains the unique ID of the workspace in the Scale system that this output file should be stored in after it is produced.
files
A “files” output has the following additional field:
workspace_id: JSON number
The required workspace_id field contains the unique ID of the workspace in the Scale system that these output files should be stored in after they are produced.
Recipe Definition¶
A recipe is a collection of jobs that get run together. The recipe definition is a JSON document that defines how the recipe is run. It will describe the recipe’s inputs, the jobs that will be run as part of the recipe, and how the inputs and outputs of those jobs are connected.
Consider the following example algorithms, called make_geotiff.py and detect_points.py. make_geotiff.py is a Python script that takes a PNG image file and a CSV containing georeference information for the PNG. It combines the information from the two files to create a GeoTIFF file, which is an image format that contains georeference information. detect_points.py is a Python script that takes a GeoTIFF image file and creates a GeoJSON file that contains the coordinates of various points of interest found in the GeoTIFF. The job interfaces of the two algorithms could be defined as follows (see Job Interface for how to define job interfaces):
{
"version": "1.0",
"command": "python make_geotiff.py",
"command_arguments": "${image} ${georeference_data} ${job_output_dir}",
"input_data": [
{
"name": "image",
"type": "file",
"media_types": [
"image/png"
]
},
{
"name": "georeference_data",
"type": "file",
"media_types": [
"text/csv"
]
}
],
"output_data": [
{
"name": "geo_image",
"type": "file",
"media_type": "image/tiff"
}
]
}
{
"version": "1.0",
"command": "python detect_points.py",
"command_arguments": "${image} ${job_output_dir}",
"input_data": [
{
"name": "image",
"type": "file",
"media_types": [
"image/tiff"
]
}
],
"output_data": [
{
"name": "geo_image",
"type": "file",
"media_type": "application/vnd.geo+json"
}
]
}
Now we would like to combine those two algorithms into a recipe that runs both jobs. The recipe will take a PNG and a CSV, pass these files to the make_geotiff.py algorithm, and then pass the resulting GeoTIFF file to the detect_points.py algorithm. The recipe definition could be described as follows:
Example recipe definition:
{
"version": "1.0",
"input_data": [
{
"name": "image",
"type": "file",
"media_types": [
"image/png"
]
},
{
"name": "georeference_data",
"type": "file",
"media_types": [
"text/csv"
]
}
],
"jobs": [
{
"name": "make_geotiff",
"job_type": {
"name": "geotiff-maker",
"version": "1.2.3"
},
"recipe_inputs": [
{
"recipe_input": "image",
"job_input": "image"
},
{
"recipe_input": "georeference_data",
"job_input": "georeference_data"
}
]
},
{
"name": "detect_points",
"job_type": {
"name": "point-detector",
"version": "4.5.6"
},
"dependencies": [
{
"name": "make_geotiff",
"connections": [
{
"output": "geo_image",
"input": "image"
}
]
}
]
}
]
}
The input_data value is a list detailing the inputs to the recipe; in this case an input called image that is a file with media type image/png, and an input called georeference_data which is a CSV file. These inputs happen to be identical to the inputs of the make_geotiff.py job. The jobs value is a list of all of the jobs that make up this recipe and how their inputs and outputs are connected with the rest of the recipe. make_geotiff.py and detect_points.py are both job types stored in Scale. The job_type object indicates the type of the job that we want to run within the recipe. The name value defines the name of the job within the recipe (for linking jobs together). The “make_geotiff” job uses the recipe_inputs list to connect the recipe inputs to its job inputs. The recipe inputs happen to have the same names as the “make_geotiff” job inputs in this example, but the names do not need to match. The “detect_points” job uses the dependencies list to declare that the “make_geotiff” job must complete successfully before “detect_points” is put on the queue. The connections list indicates that the output “geo_image” from the “make_geotiff” job should be fed to the “image” input of the “detect_points” job. To see all of the options for defining a recipe, please refer to the Recipe Definition Specification below.
Recipe Definition Specification Version 1.0¶
A valid recipe definition is a JSON document with the following structure:
{
"version": STRING,
"input_data": [
{
"name": STRING,
"type": "property",
"required": true|false
},
{
"name": STRING,
"type": "file",
"required": true|false,
"media_types": [
STRING, STRING
]
},
{
"name": STRING,
"type": "files",
"required": true|false,
"media_types": [
STRING, STRING
]
}
],
"jobs": [
{
"name": STRING,
"job_type": {
"name": STRING,
"version": STRING
},
"recipe_inputs": [
{
"recipe_input": STRING,
"job_input": STRING
}
],
"dependencies": [
{
"name": STRING,
"connections": [
{
"output": STRING,
"input": STRING
}
]
}
]
}
]
}
version: JSON string
The version is an optional string value that defines the version of the definition specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept the recipe definition.
Scale must recognize the version number as valid for the recipe to work. Currently, “1.0” is the only valid version.
input_data: JSON array
The input_data is an optional list of JSON objects that define the inputs the recipe receives to run all of its jobs. If not provided, input_data defaults to an empty list (no inputs). The JSON object that represents each input has the following fields:
name: JSON string
The name is a required string that defines the name of the input. The name of every input in the recipe must be unique. This name must only be composed of less than 256 of the following characters: alphanumeric, ” ”, “_”, and “-”.
required: JSON boolean
The required field is optional and indicates if the input is required for the recipe to run successfully. If not provided, the required field defaults to true.
type: JSON string
The type is a required string from a defined set that defines the type of the input. The input_data JSON object may have additional fields depending on its type. The valid types are:
property
A “property” input is a string that is passed to the recipe. A “property” input has no additional fields.
file
A “file” input is a single file that is provided to the recipe. A “file” input has the following additional fields:
media_types: JSON array
A media_types field on a “file” input is an optional list of strings that designate the required media types for any file being passed in the input. Any file that does not match one of the listed media types will be prevented from being passed to the recipe. If not provided, the media_types field defaults to an empty list and all media types are accepted for the input.
files
A “files” input is a list of one or more files that is provided to the recipe. A “files” input has the following additional fields:
media_types: JSON array
A media_types field on a “files” input is an optional list of strings that designate the required media types for any files being passed in the input. Any file that does not match one of the listed media types will be prevented from being passed to the recipe. If not provided, the media_types field defaults to an empty list and all media types are accepted for the input.
jobs: JSON array
The jobs value is a required list of JSON objects that define the jobs that will be run as part of the recipe. The JSON object that represents each job has the following fields:
name: JSON string
The name is a required string that defines the name of the job within the recipe. The name of every job in the recipe must be unique. This name must only be composed of less than 256 of the following characters: alphanumeric, ” ”, “_”, and “-”.job_type
The job_type object is a required reference to the job type to run for this place in the recipe. A job type is uniquely identified by the combination of its system name and version.
name: JSON string
The name used by the system to refer to a job, including in database, recipe, or service references.version: JSON string
The specific version of a job to run since a named job could have multiple versions.recipe_inputs: JSON array
The recipe_inputs value is an optional list that specifies the recipe inputs that should be passed to this job’s inputs. If not provided, recipe_inputs defaults to an empty list (no recipe inputs used by this job). The JSON object that represents each connection to a recipe input has the following fields:
recipe_input: JSON string
The recipe_input is a required string that defines the name of the recipe input to pass to the job.job_input: JSON string
The job_input is a required string that defines the name of the job input that the recipe input should be passed to.dependencies: JSON array
The dependencies value is an optional list that specifies the other jobs that this job is dependent on. If not provided, dependencies defaults to an empty list (no dependencies so this job will be queued immediately when the recipe is created).The JSON object that represents each connection to a recipe input has the following fields:
name: JSON string
The name is a required string that provides the name of the job that is being depended upon. The name value must match the name of another job within the recipe definition. Circular job dependencies are invalid.
connections: JSON array
The connections value is an optional list that specifies the outputs of the job depended upon that should be passed to this job’s inputs. If not provided, connections defaults to an empty list (no outputs used by this job). The JSON object that represents each connection to a job output has the following fields:
output: JSON string
The output is a required string that defines the name of the output of the depended upon job.
input: JSON string
The input is a required string that defines the name of this job’s input that should receive the output from the depended upon job.
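To illustrate how these fields fit together, here is a hedged sketch of a jobs list (the job type names, versions, and input/output names are hypothetical, not part of the specification). The “tile” job depends on the “georeference” job and consumes its output:

{
    "jobs": [
        {
            "name": "georeference",
            "job_type": {
                "name": "georeference-algorithm",
                "version": "1.0.0"
            },
            "recipe_inputs": [
                {
                    "recipe_input": "image",
                    "job_input": "input_image"
                }
            ]
        },
        {
            "name": "tile",
            "job_type": {
                "name": "tile-generator",
                "version": "2.1.0"
            },
            "dependencies": [
                {
                    "name": "georeference",
                    "connections": [
                        {
                            "output": "georeferenced_image",
                            "input": "input_image"
                        }
                    ]
                }
            ]
        }
    ]
}

When a recipe with this definition is created, the “georeference” job is queued immediately (it has no dependencies), while the “tile” job is queued only after “georeference” completes successfully.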
Recipe Data¶
The recipe data is a JSON document that defines the actual data and configuration on which a recipe will run. It will describe all of the data being passed to the recipe’s inputs, as well as the workspace for storing the output for all of the recipe’s jobs. The recipe data is required when creating and queuing a recipe.
Consider our previous example recipe definition from Recipe Definition. The recipe data for creating a recipe with that example definition could be defined as follows:
Example recipe data:
{
"version": "1.0",
"input_data": [
{
"name": "image",
"file_id": 1234
},
{
"name": "georeference_data",
"file_id": 1235
}
],
"workspace_id": 12
}
The input_data value is a list detailing the data to pass to each input in the recipe. In this case the input called image that takes a PNG image file is being passed a file from the Scale system that has the unique ID 1234, and the input called georeference_data which takes a CSV file is being passed a Scale file with the ID 1235. The workspace_id value indicates that any files produced by the jobs in the recipe should be stored in the workspace with the unique ID 12. To see all of the options for defining recipe data, please refer to the Recipe Data Specification below.
Recipe Data Specification Version 1.0¶
A valid recipe data is a JSON document with the following structure:
{
"version": STRING,
"input_data": [
{
"name": STRING,
"value": STRING
},
{
"name": STRING,
"file_id": INTEGER
},
{
"name": STRING,
"file_ids": [
INTEGER,
INTEGER
]
}
],
"workspace_id": INTEGER
}
version: JSON string
The version is an optional string value that defines the version of the data specification used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your recipe data.
input_data: JSON array
The input_data is a list of JSON objects that define the actual data the recipe receives for its inputs. If not provided, input_data defaults to an empty list (no input data). For the recipe data to be valid, every required input in the matching recipe definition must have a corresponding entry in this input_data field. The JSON object that represents each input data has the following fields:
name: JSON string
The name is a required string that gives the name of the input that the data is being provided for. It should match the name of an input in the recipe’s definition. The name of every input in the recipe data must be unique.
The other fields that describe the data being passed to the input are based upon the type of the input as it is defined in the recipe definition, see Recipe Definition Specification Version 1.0. The valid types from the recipe definition specification are:
property
A “property” input has the following additional field:
value: JSON string
The value field contains the string value that will be passed to the “property” input.
file
A “file” input has the following additional field:
file_id: JSON number
The required file_id field contains the unique ID of a file in the Scale system that will be passed to the input. The file must meet all of the criteria defined in the recipe definition for the input.
files
A “files” input has the following additional field:
file_ids: JSON array
The required file_ids field is a list of unique IDs of the files in the Scale system that will be passed to the input. Each file must meet all of the criteria defined in the recipe definition for the input. A “files” input will accept a file_id field instead of a file_ids field (the input will be passed a list containing the single file).
workspace_id: JSON number
The workspace_id is required if any of the jobs in the recipe produce any output files. The workspace_id value is an integer providing the unique ID of the workspace to use for storing any files produced by the recipe’s jobs.
Import/Export¶
Scale has the ability to export and import certain configuration settings related to executing workflows such as errors, job types, and recipe types. This feature allows a user to build a recipe or job in one installation for testing purposes and then easily migrate it to another installation for production use without having to reconstruct everything using the web application user interface.
Operating this way has a number of advantages. It permits trying things out without affecting the production system. It saves time since wiring up all the recipe connections can be time consuming. It avoids potential errors that could be introduced by manually recreating a fully tested and working configuration. It helps with upgrades since the import process has the ability to handle some types of changes automatically for the user. The import process is different than a traditional database dump/restore in that it includes a lot of logic to ensure that a job type or recipe type cannot be changed in a way that would invalidate its connections/dependencies. It also keeps track of past versions and snapshots the definitions/interfaces that were used at the time each execution took place.
Future versions of the import system may assist the user with additional types of changes by using a wizard-based guide, prompting the user for how certain conflicts should be resolved.
Example configuration export:
{
"version": "1.0",
"errors": [
{
"name": "my-error",
"title": "My Error",
"description": "My error description.",
"category": "DATA"
}
],
"job_types": [
{
"name": "my-job",
"version": "1.0.0",
"title": "My Job",
"description": "My job description.",
"category": "example",
"author_name": null,
"author_url": null,
"is_operational": true,
"icon_code": "f013",
"docker_privileged": false,
"docker_image": null,
"priority": 100,
"timeout": 1800,
"max_tries": 3,
"cpus_required": 1.0,
"mem_required": 64.0,
"disk_out_const_required": 64.0,
"disk_out_mult_required": 0.0,
"interface": {
"version": "1.0",
"command": "my-cmd",
"command_arguments": "${input_file} ${job_output_dir}",
"input_data": [
{
"media_types": [
"image/png"
],
"required": true,
"type": "file",
"name": "input_file"
}
],
"output_data": [
{
"media_type": "image/jpg",
"required": true,
"type": "file",
"name": "my-output-file"
}
],
"shared_resources": []
        },
"error_mapping": {
"version": "1.0",
"exit_codes": {
"1": "my-error"
}
},
"trigger_rule": {
"type": "PARSE",
"name": "my-rule",
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": []
}
}
}
}
],
"recipe_types": [
{
"name": "my-recipe",
"version": "1.0.0",
"title": "My Recipe",
"description": "My recipe description.",
"definition": {
"version": "1.0",
"input_data": [
{
"media_types": [
"image/png"
],
"required": true,
"type": "file",
"name": "input_file"
}
],
"jobs": []
},
"trigger_rule": {
"type": "PARSE",
"name": "my-rule",
"configuration": {
"version": "1.0",
"data": {
"workspace_name": "products",
"input_data_name": "input_file"
},
"condition": {
"media_type": "image/png",
"data_types": []
}
}
}
}
]
}
The errors field is used to define the meaning of any exit codes that a job type may produce at the end of its execution when it detects a known problem. The job_types field lists all of the job types to import; a job is the smallest unit of work in Scale. A job type includes basic attributes, as well as its associated error mappings, command line interface, and the trigger rule that kicks off the job as data arrives. The recipe_types field lists all of the recipe types to import; a recipe type is used to build a processing workflow composed of job types that execute under different conditions. Recipe types support sequential and/or parallel processing constructs and can therefore trigger processing as data arrives or as other jobs generate products upon completion. To see all of the options for an exported configuration, please refer to the Configuration Specification below.
Import/Export Configuration Specification Version 1.0¶
A valid exported configuration is a JSON document with the following structure:
{
"version": "1.0",
"errors": [
...
],
"job_types": [
...
],
"recipe_types": [
...
]
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your exported configuration.
errors: JSON array
The errors field is optional and contains JSON objects that define attributes required to import a new error or edit an existing error identified by the name attribute.
job_types: JSON array
The job_types field is optional and contains JSON objects that define attributes required to import a new job type or edit an existing job type identified by the combination of the name and version attributes.
recipe_types: JSON array
The recipe_types field is optional and contains JSON objects that define attributes required to import a new recipe type or edit an existing recipe type identified by the combination of the name and version attributes.
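For example, a hedged sketch of a minimal configuration that imports only a single error (the name, title, and description are illustrative):

{
    "version": "1.0",
    "errors": [
        {
            "name": "my-new-error",
            "title": "My New Error",
            "description": "An example error description.",
            "category": "DATA"
        }
    ]
}

Importing this document would create the error if no error named my-new-error exists, or edit the existing error with that name.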
Strike¶
Strike is the name of Scale’s directory watching capability. A Strike process monitors a given directory on a Network File System (NFS) for new files and ingests those files into a Scale workspace if they meet certain criteria. When a new Strike process is created, a new job is created and executed to run the new process. A Strike process contains configuration specifying the details of the NFS directory to monitor and how to ingest and store the files that appear in the directory.
Example Strike configuration:
{
"version": "1.0",
"mount": "host:/my/path",
"transfer_suffix": "_tmp",
"files_to_ingest": [
{
"filename_regex": ".*h5",
"data_types": [],
"workspace_path": "/wrksp/path",
"workspace_name": "rs"
}
]
}
The mount field specifies the NFS host and path that should be mounted in order to access the directory to be monitored. The transfer_suffix field defines a suffix that is used on file names to indicate that they are still transferring and have not yet finished being copied into the monitored directory. The files_to_ingest value is a list detailing the different files to ingest and how to ingest them. The filename_regex field defines a regular expression to check against the names of newly copied files. If the expression matches a newly copied file name in the directory, that file is ingested according to the other fields in the JSON object. The data_types field is a list of strings. Any file that matches the corresponding regular expression will have these data type strings “tagged” with the file. The data type tags are used to categorize files and control which jobs and recipes they go to. The workspace_path field specifies a relative path within the workspace where each file will be stored. Three additional and dynamically named directories, for the current year, month, and day, will be appended to the workspace_path value automatically by the Scale system when a file is ingested (i.e. workspace_path/YYYY/MM/DD). The workspace_name field is the unique system name of the workspace that should ingest the file. To see all of the options for a Strike process’s configuration, please refer to the Strike Configuration Specification below.
Strike Configuration Specification Version 1.0¶
A valid Strike configuration is a JSON document with the following structure:
{
"version": "1.0",
"mount": STRING,
"transfer_suffix": STRING,
"files_to_ingest": [
{
"filename_regex": STRING,
"data_types": [
STRING,
STRING
],
"workspace_path": STRING,
"workspace_name": STRING
}
]
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your Strike configuration.
mount: JSON string
The mount field is a required string that specifies the NFS host and path that should be mounted in order to access the monitored directory (format is host:/file/path).
transfer_suffix: JSON string
The transfer_suffix field is a required string that defines a suffix that is used on the file names (by the system or process that is transferring files into the directory) to indicate that the files are still transferring and have not yet finished being copied into the monitored directory.
files_to_ingest: JSON array
The files_to_ingest field is a list of JSON objects that define the rules for which files to ingest and how to ingest them. The array must contain at least one item. Each JSON object has the following fields:
filename_regex: JSON string
The filename_regex field is a required string that defines a regular expression to check against the names of newly copied files. When a new file is copied in the monitored directory, each expression is checked against the file name in order of the files_to_ingest array. If an expression matches a newly copied file name in the directory, that file is ingested according to the other fields in the JSON object and all subsequent rules/expressions in the list are ignored.
data_types: JSON array
The data_types field is an optional list of strings. Any file that matches the corresponding file name regular expression will have these data type strings “tagged” with the file. If not provided, data_types defaults to an empty array.
workspace_path: JSON string
The workspace_path field is a required string that specifies a relative path within the workspace where each file will be stored. Three additional and dynamically named directories, for the current year, month, and day, will be appended to the workspace_path value automatically by the Scale system when a file is ingested (i.e. workspace_path/YYYY/MM/DD).
workspace_name: JSON string
The workspace_name field is required and contains the unique system name of the workspace that should store each file that matches the corresponding file name regular expression.
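Because the rules in files_to_ingest are evaluated in order and only the first matching rule applies, more specific expressions should be listed before more general ones. A hedged sketch (the workspace names, paths, and data types are illustrative):

{
    "version": "1.0",
    "mount": "host:/my/path",
    "transfer_suffix": "_tmp",
    "files_to_ingest": [
        {
            "filename_regex": "calib.*h5",
            "data_types": ["calibration"],
            "workspace_path": "/calibration",
            "workspace_name": "cal"
        },
        {
            "filename_regex": ".*h5",
            "data_types": [],
            "workspace_path": "/wrksp/path",
            "workspace_name": "rs"
        }
    ]
}

With this configuration, a file named calib_01.h5 matches the first rule and is tagged “calibration”, while any other .h5 file falls through to the second rule.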
Triggers¶
Scale uses triggers for automatically generating jobs and recipes to execute as new source data enters the system. Rules are configured and when a (trigger) event occurs in Scale that matches an existing trigger rule, the job(s) and/or recipe(s) for the rule are created and placed on the queue. A given trigger event can trigger multiple rules. There are two different types of Scale triggers: ingest triggers and parse triggers.
Ingest Triggers¶
Ingest triggers are triggers that can occur when a source file is ingested into Scale. A trigger event is generated for every file ingest and checked against all ingest trigger rules.
Example ingest trigger configuration:
{
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": [
"foo",
"bar"
]
},
"data": {
"input_data_name": "my_file",
"workspace_name": "my_workspace"
}
}
The condition field is used to define the conditions for when the ingest rule is triggered. The media_type field says that an ingested file must have a media type of text/plain (a plain text file) in order to trigger this rule. The data_types field specifies that the ingested file must also have the data types “foo” and “bar” tagged on it in order to trigger the rule. The data field specifies the information needed to create the applicable job/recipe (whatever the trigger rule is linked to) when the rule is triggered. The input_data_name field defines the input parameter name of the job/recipe that the ingested file should be passed to, and the workspace_name field gives the unique system name of the workspace for storing all of the products generated by the created job/recipe. To see all of the options for an ingest trigger rule’s configuration, please refer to the Ingest Trigger Configuration Specification below.
Ingest Trigger Configuration Specification Version 1.0¶
A valid ingest trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"condition": {
"media_type": STRING,
"data_types": [
STRING,
STRING
]
},
"data": {
"input_data_name": STRING,
"workspace_name": STRING
}
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your ingest trigger rule configuration.
condition: JSON object
The condition field is optional and contains other fields that specify the conditions under which this ingest rule is triggered. If not provided, the rule is triggered by EVERY source file ingest.
media_type: JSON string
The media_type field is an optional string that defines a media type. An ingested file must have the identical media type defined here in order to trigger this rule. If not provided, the field defaults to “” and all file media types are accepted by the rule.
data_types: JSON array
The data_types field is an optional list of data type strings. An ingested file must have all of the data types that are listed here tagged to the file in order to trigger this rule. If not provided, the field defaults to [] and no data types are required.
data: JSON object
The data field is required and contains other fields that specify the details for creating the job/recipe linked to this trigger rule.
input_data_name: JSON string
The input_data_name field is a required string that specifies the input parameter name of the triggered job/recipe that the ingested file should be passed to when the job/recipe is created and placed on the queue.
workspace_name: JSON string
The workspace_name field is required and contains the unique system name of the workspace that should store the products created by the triggered job/recipe.
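Since the condition field is optional, a rule that should fire for every ingested file simply omits it. A hedged sketch (the input and workspace names are illustrative):

{
    "version": "1.0",
    "data": {
        "input_data_name": "my_file",
        "workspace_name": "my_workspace"
    }
}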
Parse Triggers¶
Parse triggers are triggers that can occur when a source file is parsed. This happens when a job completes with a parse_results section in its generated results manifest file, see Results Manifest. A trigger event is generated for every source file parse and checked against all parse trigger rules.
Example parse trigger configuration:
{
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": [
"foo",
"bar"
]
},
"data": {
"input_data_name": "my_file",
"workspace_name": "my_workspace"
}
}
The condition field is used to define the conditions for when the parse rule is triggered. The media_type field says that a parsed file must have a media type of text/plain (a plain text file) in order to trigger this rule. The data_types field specifies that the parsed file must also have the data types “foo” and “bar” tagged on it in order to trigger the rule. The data field specifies the information needed to create the applicable job/recipe (whatever the trigger rule is linked to) when the rule is triggered. The input_data_name field defines the input parameter name of the job/recipe that the parsed file should be passed to, and the workspace_name field gives the unique system name of the workspace for storing all of the products generated by the created job/recipe. To see all of the options for a parse trigger rule’s configuration, please refer to the Parse Trigger Configuration Specification below.
Parse Trigger Configuration Specification Version 1.0¶
A valid parse trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"condition": {
"media_type": STRING,
"data_types": [
STRING,
STRING
]
},
"data": {
"input_data_name": STRING,
"workspace_name": STRING
}
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your parse trigger rule configuration.
condition: JSON object
The condition field is optional and contains other fields that specify the conditions under which this parse rule is triggered. If not provided, the rule is triggered by EVERY source file parse.
media_type: JSON string
The media_type field is an optional string that defines a media type. A parsed file must have the identical media type defined here in order to trigger this rule. If not provided, the field defaults to “” and all file media types are accepted by the rule.
data_types: JSON array
The data_types field is an optional list of data type strings. A parsed file must have all of the data types that are listed here tagged to the file in order to trigger this rule. If not provided, the field defaults to [] and no data types are required.
data: JSON object
The data field is required and contains other fields that specify the details for creating the job/recipe linked to this trigger rule.
input_data_name: JSON string
The input_data_name field is a required string that specifies the input parameter name of the triggered job/recipe that the parsed file should be passed to when the job/recipe is created and placed on the queue.
workspace_name: JSON string
The workspace_name field is required and contains the unique system name of the workspace that should store the products created by the triggered job/recipe.
Clock Triggers¶
Clock triggers are triggers that occur on a pre-defined schedule. The Scale clock process fires every minute and checks which clock trigger rules are due to be executed. A trigger event is generated for every clock tick that exceeds the threshold specified by a clock trigger rule. Each clock rule uses its own custom trigger event that is defined by the specification outlined below. Clock rules are useful for general system maintenance that cannot be associated with a normal event like file parsing. Calculating system metrics/performance or archiving old records are good use cases for a clock rule.
Example clock trigger configuration:
{
"version": "1.0",
"event_type": "MY_METRICS",
"schedule": "PT1H0M0S"
}
The event_type field determines the type of event that is triggered and is used when determining the last time an event was triggered for the rule. The schedule field determines how often the event should be triggered. The schedule value uses the ISO-8601 period format and is interpreted as absolute time within each day. Therefore, the example above specifies that the trigger should happen every hour, on the hour. If an event is triggered a few minutes after the hour, the next event will still attempt to fire at the top of the next hour, rather than exactly one hour after the previous event in relative time. This makes the system more predictable and avoids events slowly drifting over time.
Also note that the name field of the trigger rule model must match a corresponding clock event processor registration in the clock module. The processor registration determines what function the Scale clock will execute when the rule is due to trigger a new event.
Clock Trigger Configuration Specification Version 1.0¶
A valid clock trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"event_type": STRING,
"schedule": STRING
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your clock trigger rule configuration.
event_type: JSON string
The event_type field is a required string that determines the trigger event associated with the rule. When the clock process checks to see if a rule needs to be triggered, it will query for associated events using this type. If the clock determines that the rule does in fact need to trigger, then this type is used to create the new event that is passed to the clock processor function to do the actual work.
schedule: JSON string
The schedule field is a required string that specifies how often the rule should be triggered. The value must follow the ISO-8601 period format, which takes the form of hours, minutes, and seconds to trigger an event. Note that the current Scale clock implementation does not support the optional days portion of the standard and the smallest time slice that it can execute is once every minute. It is also important to note the scheduler interprets the period relative to the start of each day, rather than relative to its last triggered event. That way if a schedule is defined for every hour and one of the executions falls behind by a few minutes, the next event will still attempt to trigger as close to the hour as possible. For example, if we request execution every hour using PT1H0M0S and the last event actually runs at 11:07AM, then the next execution will be attempted at 12:00PM even though that is not a full hour later.
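As an illustration of the absolute-time interpretation, here is a hedged sketch of a rule meant to run four times an hour (the event type name is illustrative):

{
    "version": "1.0",
    "event_type": "MY_CLEANUP",
    "schedule": "PT0H15M0S"
}

Because the period is measured from the start of each day, this rule attempts to trigger at :00, :15, :30, and :45 past each hour, even if an individual execution runs late.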
Workspaces¶
A workspace in the Scale system is a location where files are stored (source files, product files, etc). A workspace contains configuration specifying how files are stored into and retrieved from the workspace. Workspaces are configured to use various brokers, which know how to store/retrieve files in different storage systems (e.g. NFS, FTP).
Example workspace configuration:
{
"version": "1.0",
"broker": {
"type": "nfs",
"mount": "host:/my/path"
}
}
The broker value is a JSON object providing the configuration for this workspace’s broker. The type value indicates that the NFS (Network File System) broker should be used for this workspace. The mount field specifies the NFS host and path that should be mounted in order to access the files. To see all of the options for a workspace’s configuration, please refer to the Workspace Configuration Specification below.
Workspace Configuration Specification Version 1.0¶
A valid workspace configuration is a JSON document with the following structure:
{
"version": STRING,
"broker": {
"type": "nfs",
"mount": STRING
}
}
version: JSON string
The version is an optional string value that defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value for version if it is not included is the latest version, which is currently 1.0. It is recommended, though not required, that you include the version so that future changes to the specification will still accept your workspace configuration.
broker: JSON object
The broker is a JSON object that defines the broker that the workspace should use for retrieving and storing files. The broker JSON object has the following fields:
type: JSON string
The type is a required string that specifies the type of the broker to use. The other fields that configure the broker are based upon the type of the broker in the type field. The valid broker types are:
nfs
An “nfs” broker utilizes an NFS (Network File System) for file storage. An NFS broker has the following additional field:
mount: JSON string
The mount is a required string that specifies the NFS host and path that should be mounted in order to access the files (format is host:/file/path).
Django/Code Base¶
The Scale source code, which powers the scheduler, database schema, web server, and built-in jobs, is built upon Django, a powerful Python web framework. The Scale system is made up of many Django “apps” that represent logical pieces of the system. Source code documentation for every app is provided below:
Install¶
System Dependencies¶
Scale requires the following to be installed on all masters and slaves:
- Apache Mesos 0.21.X
- Docker 1.5.x
- Python 2.7.x
- Virtualenv
- Pip
- python-setuptools
- geos
- postgis2_93 (TODO: is this right?)
Installation on CentOS7
Apache Mesos
- Download the mesos rpm from mesosphere
- rpm -i mesos-0.21.1-1.1.centos701406.x86_64.rpm
Docker
- yum install -y docker
- edit /etc/sysconfig/docker and add any private registries
- systemctl enable docker
- systemctl start docker
When Docker is installed, a new docker group is added. Whichever user runs the Scale nodes will need to be added to the docker group.
Python and virtualenv
- sudo yum install gcc httpd openssl-devel zlib-devel sqlite-devel bzip2-devel -y
- unzip a Python distribution
- from the unzipped directory run:
- ./configure
- sudo make altinstall
- Download the distributions for setuptools, pip, and virtualenv
- For each of the above, unzip the distribution and run /usr/local/bin/python2.7 setup.py install
Geos and Postgis
- yum install -y geos geos-devel libpqxx gdal-libs proj
Deploy Script¶
Scale provides a deploy script to make the process of deploying Scale to a machine easier. To run the deploy script, unzip the release tar.gz file and run:
scale/scripts/deploy.sh <mesos-master> [<deploy.to.directory> [<scale-user> [<scale-group>]]]
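For example, a hypothetical invocation (the hostname, directory, user, and group are placeholders for your own values):
scale/scripts/deploy.sh mesos-master.example.com /opt/scale scale scale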
The script will stop any Scale or Mesos services running, copy the files to the correct location, and retrieve the Python dependencies into a virtual environment.
The deploy script shuts down the following services before deploying:
- mesos-master
- mesos-slave
- scale-web
- scale-scheduler
After the script is run, you will need to restart the services manually. Restarting is a separate step because the target system could be either a master or a slave.
Scale Configuration¶
Local settings for Scale are contained in /etc/scale/local_settings.py. A sample of this file can be found in the distribution under /scale/local_settings_SAMPLE_PROD.py.
The script /scripts/deployServicesFromTemplates.sh will install and enable all of the Scale services on a CentOS 7 system. To install or deploy only a subset of the services, modify the script or perform similar steps manually.
Mesos Configuration¶
Scale uses Mesos to assign processing to nodes. You must configure the Mesos slave nodes to point to the correct Mesos master (or ZooKeeper). Additionally, you must add Docker to the available Mesos containerizers. For a Mesosphere installation, this can be done with the following commands:
- echo <mesos-master>:5050 > /tmp/master
- sudo mv /tmp/master /etc/mesos-slave
- echo mesos,docker > /tmp/containerizers
- sudo mv /tmp/containerizers /etc/mesos-slave
Also disable zookeeper for the master and slave:
- Modify /etc/default/mesos-master and remove the ZK line
- Modify /etc/default/mesos-slave and remove the ZK line (this file will be empty now)
To perform an installation of Scale, you will need to install the necessary system dependencies. After installing the System Dependencies, run the Deploy Script to install Scale to the correct location.
See Scale Configuration for configuration details.
Development¶
Setting up the project
Follow the instructions in scale/README.txt within the project.
Remote Debugging
The Scale scheduler will attempt to connect to a pydev remote debugging server if the following conditions are true:
- Debug is on in django settings
- A pydev debug server is running on your development machine. To start this in Eclipse, open the Debug perspective and select “Pydev->Start Debug Server”
- The REMOTE_DEBUG_HOST environment variable is set on the machine you wish to debug. This should be set to the hostname of the machine running the pydev debug server.
- The PYDEV_SRC environment variable is set on the machine you wish to debug. This should be set to a pydev installation. In order to set this environment variable for the default scale service installation, modify /etc/scale/Environment
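As a hedged sketch, the relevant lines in /etc/scale/Environment might look like the following (the hostname and pydev path are placeholders for your own values, and the KEY=value format is an assumption about that file):

REMOTE_DEBUG_HOST=devbox.example.com
PYDEV_SRC=/opt/pydev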
In addition to setting the PYDEV_SRC and REMOTE_DEBUG_HOST, you must ensure the pydev installation has been modified correctly.
Within your pydev installation, modify pysrc/pydevd_file_utils.py and change the PATHS_FROM_ECLIPSE_TO_PYTHON to match your installation.