diff --git a/.dockerignore b/.dockerignore index eb8ef92aab8d..3ee0ef375bc1 100644 --- a/.dockerignore +++ b/.dockerignore @@ -1,32 +1,11 @@ -node_modules -npm-debug.log -Dockerfile -.dockerignore -build -dist -docs -tests -tox.ini -public -label_studio.egg-info -.git -.github -.vscode -.editorconfig -.gitignore -*.md -!README.md -*.txt -!requirements.txt -*.yml -*.json -*.pem -.python-version - -# shell scripts -update_pypi.sh +# Ignore everything: +** -# misc folders -tmp -etc -my_* +# Except: +!images +!label_studio +!scripts +!tools +!setup.py +!requirements.txt +!README.md \ No newline at end of file diff --git a/README.md b/README.md index 13a8d0840013..ca4a4962e9ea 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # Label Studio · ![GitHub](https://img.shields.io/github/license/heartexlabs/label-studio?logo=heartex) [![Build Status](https://travis-ci.com/heartexlabs/label-studio.svg?branch=master)](https://travis-ci.com/heartexlabs/label-studio) [![codecov](https://codecov.io/gh/heartexlabs/label-studio/branch/master/graph/badge.svg)](https://codecov.io/gh/heartexlabs/label-studio) ![GitHub release](https://img.shields.io/github/v/release/heartexlabs/label-studio?include_prereleases) · :sunny: -[Website](https://labelstud.io/) • [Docs](https://labelstud.io/guide) • [Twitter](https://twitter.com/heartexlabs) • [Join Slack Community ](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw) +[Website](https://labelstud.io/) • [Docs](https://labelstud.io/guide/) • [Twitter](https://twitter.com/heartexlabs) • [Join Slack Community ](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw)
diff --git a/docs/public/search.xml b/docs/public/search.xml index 583574d8a8b5..923fbd0fb17a 100644 --- a/docs/public/search.xml +++ b/docs/public/search.xml @@ -18,22 +18,22 @@ - - - /blog/index.html + Label Studio Release Notes 0.5.0 + + /blog/release-050.html - h1 { margin-top: 2.5em !important; } .content { max-width: 1200px !important; width: unset !important; margin: 60px auto 50px auto; padding: 0; } .blog-body { margin-bottom: 100px; } .grid { display: -webkit-box; display: -ms-flexbox; display: flex; -webkit-box-orient: horizontal; -webkit-box-direction: normal; -ms-flex-direction: row; flex-direction: row; -ms-flex-wrap: wrap; flex-wrap: wrap; -webkit-box-align: stretch; -ms-flex-align: stretch; align-items: stretch; padding: 0; } .column { width: 50% !important; } .highlight { } .highlight { border: 2px solid rgba(244, 138, 66, 0.75); } .card { margin: 2em 1em; } .card .image-wrap { transition: linear 0.25s; border-radius: 7px; box-shadow: 0 0 2px rgba(0, 0, 0, 0.3); padding: 5px; opacity: 0.8; } .card .image-wrap:hover { opacity: 1; box-shadow: 0 0 5px rgba(0, 0, 0, 0.3); transition: linear 0.25s; } .card .image-wrap .image { margin: 0 auto; width: 95%; height: 250px; background-size: contain; background-repeat: no-repeat; background-position: center center; } .card .category { cursor: pointer; display: inline-block; color: $green; margin-top: 18px; font-size: 80%; font-weight: 500; letter-spacing: .08em; text-transform: uppercase; } .card .title { margin-top: 0.5em; font-size: 130%; font-weight: bold; color: #555; } .card .desc { float: right; margin-top: 18px; font-size: 80%; font-weight: normal; color: #777; } @media screen and (max-width: 900px) { @media only screen and (max-width: 768px) { .grid { width: auto; margin-left: 0 !important; margin-right: 0 !important; } .column { width: 100% !important; margin: 0 0 !important; -webkit-box-shadow: none !important; box-shadow: none !important; padding: 1rem 1rem !important; } } }]]> + A month in the 
making, this new release brings a lot of bugfixes, updated documentation, and of course, a set of new features that have been requested.

Label Studio Frontend

Relations labeling

You can create relations between labeled regions. For example, if you put two bounding boxes, you can connect them with a relation. We’ve extended the functionality to include the direction of the relation and the possibility to label the relation. Here is an example config for that:

<View>  <Relations>    <Relation value="Is A" />    <Relation value="Has Function" />    <Relation value="Involved In" />    <Relation value="Related To" />  </Relations>  <Labels name="lbl-1" toName="txt-1">    <Label value="Subject"></Label>    <Label value="Object"></Label>  </Labels>  <Text name="txt-1" value="$text"></Text></View>

Named Entity Recognition performance

NER got an update: the representation of nested entities is more apparent now, and it’s optimized to support large texts.


Image Segmentation

Initial implementation of image segmentation using masks. You get two controls: a brush with configurable size, and an eraser. The output format is RLE, implemented by the rle-pack library.
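The run-length idea behind the mask output can be sketched in a few lines. Note that rle-pack uses its own packed binary format, so this Python sketch only illustrates the concept, not the actual wire format:

```python
# Illustrative only: rle-pack's real output is a packed binary format.
# This sketch shows the run-length idea on a flat 0/1 mask.
def rle_encode(mask):
    """Collapse a flat 0/1 mask into (value, run_length) pairs."""
    runs = []
    for pixel in mask:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into a flat mask."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

mask = [0, 0, 1, 1, 1, 0, 1]
runs = rle_encode(mask)
print(runs)  # [(0, 2), (1, 3), (0, 1), (1, 1)]
assert rle_decode(runs) == mask
```

Long runs of identical pixels (common in segmentation masks) compress well under this scheme, which is why RLE is a natural fit here.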

There is a new template available that provides more information about the setup.

Changing the labels

Changing the labels of the existing regions is now easy and supported for any of the data types.

Validate labeling before submitting

Simple validation to protect you from empty results. When choices or labels are required, you can specify the required=true parameter for the Choices or Labels tag.
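A minimal sketch of such a config might look like this (the tag names are those used elsewhere in these notes; the sentiment labels are placeholders):

```xml
<View>
  <Text name="txt-1" value="$text" />
  <Choices name="sentiment" toName="txt-1" required="true">
    <Choice value="Positive" />
    <Choice value="Negative" />
  </Choices>
</View>
```

With required="true" set, submitting without selecting a choice triggers the validation alert instead of saving an empty result.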

Labels and Choices now support more markup

That enables you to build more complex interfaces. Here is an example that puts labels into different groups:


<View>  <Choices name="label" toName="audio" required="true" choice="multiple" >    <View style="display: flex; flex-direction: row; padding-left: 2em; padding-right: 2em; margin-bottom: 3em">      <View style="padding: 1em 4em; background: rgba(255,0,0,0.1)">        <Header size="4" value="Speaker Gender" />        <Choice value="Business" />        <Choice value="Politics" />      </View>      <View style="padding: 1em 4em; background: rgba(255,255,0,0.1)">        <Header size="4" value="Speech Type" />        <Choice value="Legible" />        <Choice value="Slurred" />      </View>      <View style="padding: 1em 4em; background: rgba(0,0,255,0.1)">        <Header size="4" value="Additional" />        <Choice value="Echo" />        <Choice value="Noises" />        <Choice value="Music" />      </View>    </View>  </Choices>  <Audio name="audio" value="$url" /></View>

Image Ellipses labeling

A significant contribution from @lrlunin, implementing ellipse labeling for images; check out the template.

Misc

<View>  <HyperText><h1>Hello</h1></HyperText></View>

Label Studio Backend

Multiplatform

Support for Windows, macOS, and Linux with Python 3.5 or greater

Extended import possibilities

There are now several ways to import your tasks for labeling:

On-the-fly labeling config validation

Previously, changing a config after importing or labeling tasks could invalidate the created tasks and completions, so it was switched off. Now you don’t need to worry about that: the labeling config is validated on the fly against the data already created. You can freely change the appearance of your project on the setup page and even add new labels; when you modify something crucial, you’ll be alerted about it.

Exporting with automatic converters

When finishing your project, go to the export page and choose among the common export formats valid for your current project configuration.

Connection to running Machine Learning backend

Connecting to a running machine learning backend allows you to retrain your model continually and visually inspect how its predictions behave on tasks. Just specify the ML backend URL when launching Label Studio, and start labeling.

Miscellaneous

Docker support

Label Studio is now also maintained and distributed as a Docker container: run a one-liner to build your own cloud labeling solution.

Multisession mode

You can launch Label Studio in multisession mode, where each browser session dynamically creates its own project.
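A launch sketch using the start-multi-session command, keeping per-session projects under the directory passed via --root-dir:

```shell
label-studio start-multi-session --root-dir ./session_projects
```

Each new browser session then gets its own project directory under ./session_projects, named after the session ID.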

]]>
- Label Studio Release Notes 0.5.0 - - /blog/release-050.html + + + /blog/index.html - A month in the making, this new release brings a lot of bugfixes, updated documentation, and of course, a set of new features that have been requested.

Label Studio Frontend

Relations labeling

You can create relations between labeled regions. For example, if you put two bounding boxes, you can connect them with a relation. We’ve extended the functionality to include the direction of the relation and the possibility to label the relation. Here is an example config for that:

<View>  <Relations>    <Relation value="Is A" />    <Relation value="Has Function" />    <Relation value="Involved In" />    <Relation value="Related To" />  </Relations>  <Labels name="lbl-1" toName="txt-1">    <Label value="Subject"></Label>    <Label value="Object"></Label>  </Labels>  <Text name="txt-1" value="$text"></Text></View>

Named Entity Recognition performance

NER got an update: the representation of nested entities is more apparent now, and it’s optimized to support large texts.


Image Segmentation

Initial implementation of image segmentation using masks. You get two controls: a brush with configurable size, and an eraser. The output format is RLE, implemented by the rle-pack library.

There is a new template available that provides more information about the setup.

Changing the labels

Changing the labels of the existing regions is now easy and supported for any of the data types.

Validate labeling before submitting

Simple validation to protect you from empty results. When choices or labels are required, you can specify the required=true parameter for the Choices or Labels tag.

Labels and Choices now support more markup

That enables you to build more complex interfaces. Here is an example that puts labels into different groups:


<View>  <Choices name="label" toName="audio" required="true" choice="multiple" >    <View style="display: flex; flex-direction: row; padding-left: 2em; padding-right: 2em; margin-bottom: 3em">      <View style="padding: 1em 4em; background: rgba(255,0,0,0.1)">        <Header size="4" value="Speaker Gender" />        <Choice value="Business" />        <Choice value="Politics" />      </View>      <View style="padding: 1em 4em; background: rgba(255,255,0,0.1)">        <Header size="4" value="Speech Type" />        <Choice value="Legible" />        <Choice value="Slurred" />      </View>      <View style="padding: 1em 4em; background: rgba(0,0,255,0.1)">        <Header size="4" value="Additional" />        <Choice value="Echo" />        <Choice value="Noises" />        <Choice value="Music" />      </View>    </View>  </Choices>  <Audio name="audio" value="$url" /></View>

Image Ellipses labeling

A significant contribution from @lrlunin, implementing ellipse labeling for images; check out the template.

Misc

<View>  <HyperText><h1>Hello</h1></HyperText></View>

Label Studio Backend

Multiplatform

Support for Windows, macOS, and Linux with Python 3.5 or greater

Extended import possibilities

There are now several ways to import your tasks for labeling:

On-the-fly labeling config validation

Previously, changing a config after importing or labeling tasks could invalidate the created tasks and completions, so it was switched off. Now you don’t need to worry about that: the labeling config is validated on the fly against the data already created. You can freely change the appearance of your project on the setup page and even add new labels; when you modify something crucial, you’ll be alerted about it.

Exporting with automatic converters

When finishing your project, go to the export page and choose among the common export formats valid for your current project configuration.

Connection to running Machine Learning backend

Connecting to a running machine learning backend allows you to retrain your model continually and visually inspect how its predictions behave on tasks. Just specify the ML backend URL when launching Label Studio, and start labeling.

Miscellaneous

Docker support

Label Studio is now also maintained and distributed as a Docker container: run a one-liner to build your own cloud labeling solution.

Multisession mode

You can launch Label Studio in multisession mode, where each browser session dynamically creates its own project.

]]>
+ h1 { margin-top: 2.5em !important; } .content { max-width: 1200px !important; width: unset !important; margin: 60px auto 50px auto; padding: 0; } .blog-body { margin-bottom: 100px; } .grid { display: -webkit-box; display: -ms-flexbox; display: flex; -webkit-box-orient: horizontal; -webkit-box-direction: normal; -ms-flex-direction: row; flex-direction: row; -ms-flex-wrap: wrap; flex-wrap: wrap; -webkit-box-align: stretch; -ms-flex-align: stretch; align-items: stretch; padding: 0; } .column { width: 50% !important; } .highlight { border: 2px solid rgba(244, 138, 66, 0.75); } .card { margin: 2em 1em; } .card .image-wrap { transition: linear 0.25s; border-radius: 7px; box-shadow: 0 0 2px rgba(0, 0, 0, 0.3); padding: 5px; opacity: 0.8; } .card .image-wrap:hover { opacity: 1; box-shadow: 0 0 5px rgba(0, 0, 0, 0.3); transition: linear 0.25s; } .card .image-wrap .image { margin: 0 auto; width: 95%; height: 250px; background-size: contain; background-repeat: no-repeat; background-position: center center; } .card .category { cursor: pointer; display: inline-block; color: $green; margin-top: 18px; font-size: 80%; font-weight: 500; letter-spacing: .08em; text-transform: uppercase; } .card .title { margin-top: 0.5em; font-size: 130%; font-weight: bold; color: #555; } .card .desc { float: right; margin-top: 18px; font-size: 80%; font-weight: normal; color: #777; } @media screen and (max-width: 900px) { @media only screen and (max-width: 768px) { .grid { width: auto; margin-left: 0 !important; margin-right: 0 !important; } .column { width: 100% !important; margin: 0 0 !important; -webkit-box-shadow: none !important; box-shadow: none !important; padding: 1rem 1rem !important; } } }]]>
@@ -51,22 +51,33 @@ - Export results - - /guide/completions.html + Label Studio Release Notes 0.7.0 - Cloud Storage Enablement + + /blog/release-070-cloud-storage-enablement.html - Your annotations are stored in raw completion format inside my_project_name/completions directory, one file per labeled task named as task_id.json.

You can optionally convert and export raw completions to a more common format by doing one of the following:

Basic format

The output data is stored in completions: JSON-formatted files, one per completed task, saved in the project directory’s completions folder or in the directory set by the "output_dir" option. The example structure of a completion is the following:

{    "completions": [        {            "id": "1001",            "lead_time": 15.053,            "result": [                {                    "from_name": "tag",                    "id": "Dx_aB91ISN",                    "source": "$image",                    "to_name": "img",                    "type": "rectanglelabels",                    "value": {                        "height": 10.458911419423693,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 12.4,                        "x": 50.8,                        "y": 5.869797225186766                    }                }            ]        }    ],    "data": {        "image": "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg"    },    "id": 1,    "predictions": [        {            "created_ago": "3 hours",            "model_version": "model 1",            "result": [                {                    "from_name": "tag",                    "id": "t5sp3TyXPo",                    "source": "$image",                    "to_name": "img",                    "type": "rectanglelabels",                    "value": {                        "height": 11.612284069097889,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 39.6,                        "x": 13.2,                        "y": 34.702495201535505                    }                }            ]        },        {            "created_ago": "4 hours",            "model_version": "model 2",            "result": [                {                    "from_name": "tag",                    "id": "t5sp3TyXPo",                    "source": "$image",                    "to_name": "img",                    "type": 
"rectanglelabels",                    "value": {                        "height": 33.61228406909789,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 39.6,                        "x": 13.2,                        "y": 54.702495201535505                    }                }            ]        }    ]}
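The structure above can be navigated programmatically. A minimal sketch, using a trimmed-down completion record with the same keys as the export example (the helper iter_regions is hypothetical, not part of Label Studio):

```python
# A trimmed-down completion record with the same keys as the export format.
completion_task = {
    "id": 1,
    "completions": [{
        "id": "1001",
        "lead_time": 15.053,
        "result": [{
            "from_name": "tag",
            "to_name": "img",
            "type": "rectanglelabels",
            "value": {"x": 50.8, "y": 5.87, "width": 12.4, "height": 10.46,
                      "rotation": 0, "rectanglelabels": ["Moonwalker"]},
        }],
    }],
}

def iter_regions(task):
    """Yield (label, value) pairs for every result in every completion."""
    for completion in task.get("completions", []):
        for result in completion["result"]:
            value = result["value"]
            # the key holding the labels matches the result "type"
            for label in value.get(result["type"], []):
                yield label, value

labels = [label for label, _ in iter_regions(completion_task)]
print(labels)  # ['Moonwalker']
```

The field glossary below explains each of the keys used here.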

completions

This is where the list of labeling results for a single task is stored.

id

Unique completion identifier

lead_time

Time in seconds spent to create this completion

result

Completion result data

id

Unique completion result identifier

from_name

Name of the tag that was used to label region (control tags)

to_name

Name of the object tag that provided the region to be labeled (object tags)

type

Type of the labeling/tag

value

Tag specific value that includes the labeling result details. The exact structure of value depends on the chosen labeling tag.
Explore each tag for more details.

data

Data copied from input task

id

Task identifier

predictions

Machine learning predictions (aka pre-labeling results). Follows the same format as completion, with some additional fields related to machine learning inference:

Export formats

JSON

List of items in raw completion format stored in JSON file

JSON_MIN

List of items where only "from_name", "to_name" values from raw completion format are kept:

{  "image": "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg",  "tag": [{    "height": 10.458911419423693,    "rectanglelabels": [        "Moonwalker"    ],    "rotation": 0,    "width": 12.4,    "x": 50.8,    "y": 5.869797225186766  }]}

CSV

Results are stored in a comma-separated tabular file with column names specified by the "from_name" and "to_name" values
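A hedged sketch of the tabulation idea: one column per object-tag variable and one per "from_name", with the region values serialized into the cell. This mirrors how the CSV export names its columns, not the converter’s exact output; the image URL is a placeholder:

```python
import csv
import io
import json

# One row per task: a column for the input variable ("image") and a
# column named after the control tag's "from_name" ("tag").
rows = [
    {"image": "https://example.com/1.jpg",  # placeholder URL
     "tag": json.dumps([{"x": 50.8, "y": 5.87, "width": 12.4,
                         "height": 10.46,
                         "rectanglelabels": ["Moonwalker"]}])},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["image", "tag"])
writer.writeheader()
writer.writerows(rows)

header = buf.getvalue().splitlines()[0]
print(header)  # image,tag
```

The TSV format described next is identical apart from the delimiter (pass delimiter="\t" to DictWriter).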

TSV

Results are stored in a tab-separated tabular file with column names specified by the "from_name" and "to_name" values

CONLL2003

Popular format used for CoNLL-2003 named entity recognition challenge

COCO

Popular machine learning format used by COCO dataset for object detection and image segmentation tasks

Pascal VOC XML

Popular XML-formatted task data used for object detection and image segmentation tasks

Export using API

You can use an API to request a file with exported results, e.g.

curl http://localhost:8080/api/export?format=JSON > exported_results.tar.gz

The format parameter can be any of the available export formats

]]>
Just a couple of weeks after our 0.6.0 release, we’re happy to announce a new big release. We started the discussion about cloud support months ago, and as a first step in simplifying the integration, we’re happy to introduce cloud storage connectors, like AWS S3.

We’re also very interested in learning more about your ML pipelines; if you’d like to have a conversation, please ping us on Slack.


Connecting cloud storage

You can configure Label Studio to synchronize labeling tasks with your S3 or GCS bucket, optionally filtering by a specific prefix or a file extension. Label Studio takes that list and generates pre-signed URLs each time a task is shown to the annotator.
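The prefix/extension filtering behaves like a simple predicate over object keys. Illustrative only — the key names below are made up, and the real connector applies this while listing the bucket:

```python
# Made-up bucket keys, for illustration.
keys = [
    "raw/img_001.jpg",
    "raw/img_002.png",
    "raw/notes.txt",
    "processed/img_003.jpg",
]

def filter_keys(keys, prefix="", extensions=(".jpg", ".png")):
    """Keep keys under `prefix` whose extension is in `extensions`."""
    return [k for k in keys
            if k.startswith(prefix) and k.lower().endswith(tuple(extensions))]

print(filter_keys(keys, prefix="raw/"))
# ['raw/img_001.jpg', 'raw/img_002.png']
```

Each surviving key then becomes a task, served to annotators via a short-lived pre-signed URL rather than a public link.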


There are several ways Label Studio can load a file: either as a URL or as a blob. Therefore, you can store either the list of tasks or the assets themselves, and load that.


You can also configure it to store the results back to S3/GCS, making Label Studio a part of your data processing pipeline. Read more about the configuration in the docs.

Frontend package updates

Finally, with a lot of work from Andrew, there is an implementation of frontend testing. This will make sure we don’t break things when we introduce new features. Another important part is an improved build and publishing process with configured CI. The npm frontend package is now published along with the pip package.

Labeling Paragraphs and Dialogues

Introducing a new object tag called "Paragraphs". A paragraph is a piece of text with optional additional metadata, like the author and the timestamp. With this tag we’re also experimenting with the idea of providing predefined layouts. For example, to label a dialogue you can use the following config: <Paragraphs name="conversation" value="$conv" layout="dialogue" />
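The $conv variable in that config would come from the task data. A plausible shape for a two-turn dialogue might be as follows — the field names author and text are assumptions here, so check the Paragraphs tag reference for the exact keys:

```json
{
  "conv": [
    { "author": "Operator", "text": "Hello, how can I help you?" },
    { "author": "Customer", "text": "I'd like to check my order status." }
  ]
}
```

With layout="dialogue", each entry renders as a chat-style turn attributed to its author.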


This feature is available in the enterprise version only

Different shapes on the same image

One limitation Label Studio had was that you could use only one shape on the same image; for example, you could put either bounding boxes or polygons. Now this limitation is lifted, and you can define different label groups and connect them to the same image.


maxUsages

There are a couple of ways to make sure that the annotation is performed in full. One of these concepts is the required flag, and we’ve created a new one called maxUsages. For some datasets you know how many objects of a particular type there are, so you can limit the usage of specific labels.
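A sketch of what such a config might look like — the per-label placement of the maxUsages attribute is an assumption, so verify it against the tag reference:

```xml
<View>
  <Image name="img" value="$image" />
  <RectangleLabels name="tag" toName="img">
    <!-- assumption: maxUsages caps how many regions may use this label -->
    <Label value="Moon" maxUsages="1" />
    <Label value="Crater" maxUsages="5" />
  </RectangleLabels>
</View>
```

Once a label hits its cap, the annotator can no longer apply it to new regions, which catches over-labeling at annotation time instead of review time.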

Bugfixes and Enhancements

]]>
- Frontend library - - /guide/frontend.html + Getting started + + /guide/index.html + + Overview

Label Studio is a self-contained Web application for multi-typed data labeling and exploration. The backend is written in pure Python powered by Flask. The frontend part is a backend-agnostic React + MST app, included as a precompiled script.

Here are the main concepts behind Label Studio’s workflow:

Quickstart

Prerequisites

Label Studio supports Python 3.5 or greater, running on Linux, Windows, and macOS.

Note: for Windows users the default installation may fail to build the lxml package. Consider installing it manually from unofficial Windows binaries, e.g. if you are running on x64 with Python 3.8, run pip install lxml-4.5.0-cp38-cp38-win_amd64.whl.

Running with pip

To install Label Studio via pip, you need Python>=3.5; then run:

pip install label-studio

Then launch a new project which stores all labeling data in a local directory my_labeling_project:

label-studio start my_labeling_project --init

The default browser opens automatically at http://localhost:8080.

Running with Docker

Label Studio is also distributed as a docker container. Make sure you have Docker installed on your local machine.

Install and start Label Studio at http://localhost:8080 storing all labeling data in ./my_labeling_project directory:

docker run --rm -p 8080:8080 -v `pwd`/my_labeling_project:/label-studio/my_labeling_project --name label-studio heartexlabs/label-studio:latest

Note: if the ./my_labeling_project folder already exists, an exception will be thrown. Please delete this folder or use the --force option.
Note: for Windows, you have to modify the volume paths set by the -v option

You can override the default startup command by appending any of the available command line arguments:

docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template image_mixedlabel

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Running from source

If you want to use nightly builds or extend the functionality, consider downloading the source code using Git and running Label Studio locally:

git clone https://github.com/heartexlabs/label-studio.git
cd label-studio
python setup.py develop

Then create a new project; it stores all labeling data in a local directory, my_labeling_project:

label-studio start my_labeling_project --init

The default browser will open automatically at http://localhost:8080.

Multisession mode

You can start Label Studio in multisession mode, where each browser session creates its own project with the associated session ID as a name.

In order to launch Label Studio in multisession mode and keep all projects in a separate directory session_projects, run

label-studio start-multi-session --root-dir ./session_projects

Command line arguments

You can specify input tasks, project config, machine learning backend and other options via the command line interface. Run label-studio start --help to see all available options.

]]>
+ +
+ + + + + Export results + + /guide/completions.html - Frontend, as its name suggests, is the frontend library based on React and mobx-state-tree, distributed as an NPM package. You can include it in your applications and provide data annotation support to your users. It can be granularly customized and extended.

Its repository is located at https://github.com/heartexlabs/label-studio-frontend

Install

npm install label-studio

CDN

<!-- Theme included stylesheets --><link href="https://unpkg.com/label-studio@0.4.0/build/static/css/main.14acfaa5.css" rel="stylesheet"><!-- Main Label Studio library --><script src="https://unpkg.com/label-studio@0.4.0/build/static/js/main.0249ea16.js"></script>

Quickstart

Instantiate a new Label Studio object with a selector for the div that should become the editor.

<!-- Include Label Studio stylesheet --><link href="https://unpkg.com/label-studio@0.4.0/build/static/css/main.14acfaa5.css" rel="stylesheet"><!-- Create the Label Studio container --><div id="label-studio"></div><!-- Include the Label Studio library --><script src="https://unpkg.com/label-studio@0.4.0/build/static/js/main.0249ea16.js"></script><!-- Initialize Label Studio --><script>  var labelStudio = new LabelStudio('label-studio', {    config: `      <View>        <Image name="img" value="$image"></Image>        <RectangleLabels name="tag" toName="img">          <Label value="Hello"></Label>          <Label value="World"></Label>          </RectangleLabels>      </View>    `,    interfaces: [      "panel",      "update",      "controls",      "side-column",      "completions:menu",      "completions:add-new",      "completions:delete",      "predictions:menu",    ],    user: {      pk: 1,      firstName: "James",      lastName: "Dean"    },    task: {      completions: [],      predictions: [],      id: 1,      data: {        image: "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg"      }    },        onLabelStudioLoad: function(LS) {      var c = LS.completionStore.addCompletion({        userGenerate: true      });      LS.completionStore.selectCompletion(c.id);    }  });</script>

You can use Playground to test out different types of config.

To see all the available options for the initialization of LabelStudio, please check the Reference.

]]>
+ Your annotations are stored in raw completion format inside my_project_name/completions directory, one file per labeled task named as task_id.json.

You can optionally convert and export raw completions to a more common format by doing one of the following:

Basic format

The output data is stored in completions: JSON-formatted files, one per completed task, saved in the project directory’s completions folder or in the directory set by the "output_dir" option. The example structure of a completion is the following:

{    "completions": [        {            "id": "1001",            "lead_time": 15.053,            "result": [                {                    "from_name": "tag",                    "id": "Dx_aB91ISN",                    "source": "$image",                    "to_name": "img",                    "type": "rectanglelabels",                    "value": {                        "height": 10.458911419423693,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 12.4,                        "x": 50.8,                        "y": 5.869797225186766                    }                }            ]        }    ],    "data": {        "image": "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg"    },    "id": 1,    "predictions": [        {            "created_ago": "3 hours",            "model_version": "model 1",            "result": [                {                    "from_name": "tag",                    "id": "t5sp3TyXPo",                    "source": "$image",                    "to_name": "img",                    "type": "rectanglelabels",                    "value": {                        "height": 11.612284069097889,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 39.6,                        "x": 13.2,                        "y": 34.702495201535505                    }                }            ]        },        {            "created_ago": "4 hours",            "model_version": "model 2",            "result": [                {                    "from_name": "tag",                    "id": "t5sp3TyXPo",                    "source": "$image",                    "to_name": "img",                    "type": 
"rectanglelabels",                    "value": {                        "height": 33.61228406909789,                        "rectanglelabels": [                            "Moonwalker"                        ],                        "rotation": 0,                        "width": 39.6,                        "x": 13.2,                        "y": 54.702495201535505                    }                }            ]        }    ]}

completions

This is where the list of labeling results for a single task is stored.

id

Unique completion identifier

lead_time

Time in seconds spent to create this completion

result

Completion result data

id

Unique completion result identifier

from_name

Name of the tag that was used to label region (control tags)

to_name

Name of the object tag that provided the region to be labeled (object tags)

type

Type of the labeling/tag

value

Tag specific value that includes the labeling result details. The exact structure of value depends on the chosen labeling tag.
Explore each tag for more details.

data

Data copied from input task

id

Task identifier

predictions

Machine learning predictions (aka pre-labeling results). Follows the same format as completion, with some additional fields related to machine learning inference:

Export formats

JSON

List of items in raw completion format stored in JSON file

JSON_MIN

List of items where only "from_name", "to_name" values from raw completion format are kept:

{  "image": "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg",  "tag": [{    "height": 10.458911419423693,    "rectanglelabels": [        "Moonwalker"    ],    "rotation": 0,    "width": 12.4,    "x": 50.8,    "y": 5.869797225186766  }]}

CSV

Results are stored in a comma-separated tabular file with column names specified by the "from_name" and "to_name" values

TSV

Results are stored in a tab-separated tabular file with column names specified by the "from_name" and "to_name" values

CONLL2003

Popular format used for CoNLL-2003 named entity recognition challenge

COCO

Popular machine learning format used by COCO dataset for object detection and image segmentation tasks

Pascal VOC XML

Popular XML-formatted task data used for object detection and image segmentation tasks

Export using API

You can use the API to request a file with exported results, e.g.

curl http://localhost:8080/api/export?format=JSON > exported_results.tar.gz

The format parameter can be any of the available export formats.
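For scripting exports, the request URL can be composed per format. The export_url helper below is illustrative (not a Label Studio API); the host and port are the defaults used in this guide:

```python
# Sketch: composing the export request URL for different formats.
from urllib.parse import urlencode

def export_url(fmt, host="http://localhost:8080"):
    # "format" is the query parameter shown in the curl example above
    return f"{host}/api/export?{urlencode({'format': fmt})}"

print(export_url("JSON"))  # http://localhost:8080/api/export?format=JSON
print(export_url("COCO"))  # http://localhost:8080/api/export?format=COCO
```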

]]>
@@ -84,11 +95,11 @@ - Getting started - - /guide/index.html + Frontend library + + /guide/frontend.html - Overview

Label Studio is a self-contained web application for multi-typed data labeling and exploration. The backend is written in pure Python and powered by Flask. The frontend is a backend-agnostic React + MST app, included as a precompiled script.

Here are the main concepts behind Label Studio’s workflow:

Quickstart

Prerequisites

Label Studio supports Python 3.5 or greater and runs on Linux, Windows and macOS.

Note: for Windows users, the default installation may fail to build the lxml package. Consider installing it manually from unofficial Windows binaries, e.g. if you are running x64 with Python 3.8, run pip install lxml-4.5.0-cp38-cp38-win_amd64.whl.

Running with pip

To install Label Studio via pip, you need Python>=3.5. Run:

pip install label-studio

Then launch a new project, which stores all labeling data in a local directory my_labeling_project:

label-studio start my_labeling_project --init

The default browser opens automatically at http://localhost:8080.

Running with Docker

Label Studio is also distributed as a Docker container. Make sure you have Docker installed on your local machine.

Install and start Label Studio at http://localhost:8080, storing all labeling data in the ./my_labeling_project directory:

docker run --rm -p 8080:8080 -v `pwd`/my_labeling_project:/label-studio/my_labeling_project --name label-studio heartexlabs/label-studio:latest

Note: if the ./my_labeling_project folder already exists, an exception will be thrown. Please delete this folder or use the --force option.
Note: for Windows, you have to modify the volume paths set by the -v option

You can override the default startup command by appending any of the available command line arguments:

docker run -p 8080:8080 -v `pwd`/my_project:/label-studio/my_project --name label-studio heartexlabs/label-studio:latest label-studio start my_project --init --force --template image_mixedlabel

If you want to build a local image, run:

docker build -t heartexlabs/label-studio:latest .

Running from source

If you want to use nightly builds or extend the functionality, consider downloading the source code using Git and running Label Studio locally:

git clone https://github.com/heartexlabs/label-studio.git
cd label-studio
python setup.py develop

Then create a new project; it stores all labeling data in a local directory my_labeling_project:

label-studio start my_labeling_project --init

The default browser will open automatically at http://localhost:8080.

Multisession mode

You can start Label Studio in multisession mode: each browser session creates its own project, with the associated session ID as its name.

To launch Label Studio in multisession mode and keep all projects in a separate directory session_projects, run

label-studio start-multi-session --root-dir ./session_projects

Command line arguments

You can specify input tasks, project config, machine learning backend and other options via the command line interface. Run label-studio start --help to see all available options.

]]>
+ Label Studio Frontend is, as its name suggests, the frontend library based on React and mobx-state-tree, distributed as an NPM package. You can include it in your applications to provide data annotation support to your users. It can be granularly customized and extended.

Its repository is located at https://github.com/heartexlabs/label-studio-frontend

Install

npm install label-studio

CDN

<!-- Theme included stylesheets -->
<link href="https://unpkg.com/browse/label-studio@0.4.0/build/static/css/main.14acfaa5.css" rel="stylesheet">

<!-- Main Label Studio library -->
<script src="https://unpkg.com/browse/label-studio@0.4.0/build/static/js/main.0249ea16.js"></script>

Quickstart

Instantiate a new Label Studio object with a selector for the div that should become the editor.

<!-- Include Label Studio stylesheet -->
<link href="https://unpkg.com/label-studio@0.4.0/build/static/css/main.14acfaa5.css" rel="stylesheet">

<!-- Create the Label Studio container -->
<div id="label-studio"></div>

<!-- Include the Label Studio library -->
<script src="https://unpkg.com/label-studio@0.4.0/build/static/js/main.0249ea16.js"></script>

<!-- Initialize Label Studio -->
<script>
  var labelStudio = new LabelStudio('label-studio', {
    config: `
      <View>
        <Image name="img" value="$image"></Image>
        <RectangleLabels name="tag" toName="img">
          <Label value="Hello"></Label>
          <Label value="World"></Label>
        </RectangleLabels>
      </View>
    `,

    interfaces: [
      "panel",
      "update",
      "controls",
      "side-column",
      "completions:menu",
      "completions:add-new",
      "completions:delete",
      "predictions:menu",
    ],

    user: {
      pk: 1,
      firstName: "James",
      lastName: "Dean"
    },

    task: {
      completions: [],
      predictions: [],
      id: 1,
      data: {
        image: "https://htx-misc.s3.amazonaws.com/opensource/label-studio/examples/images/nick-owuor-astro-nic-visuals-wDifg5xc9Z4-unsplash.jpg"
      }
    },

    onLabelStudioLoad: function(LS) {
      var c = LS.completionStore.addCompletion({
        userGenerate: true
      });
      LS.completionStore.selectCompletion(c.id);
    }
  });
</script>

You can use the Playground to test out different types of config.

To see all the available options for the initialization of LabelStudio, please check the Reference.

]]>
@@ -110,7 +121,7 @@ /guide/ml.html - You can easily connect your favorite machine learning framework with Label Studio by using Heartex SDK.

That gives you the opportunities to use:

Tutorials

Quickstart

Here is a quick example tutorial on how to do that with simple text classification:

  1. Clone repo
    git clone https://github.com/heartexlabs/label-studio
  1. Create new ML backend
    label-studio-ml init my_ml_backend --script label-studio/ml/examples/simple_text_classifier.py
  1. Start ML backend server
    label-studio-ml start my_ml_backend
  1. Run Label Studio connecting it to the running ML backend:
    label-studio start text_classification_project --init --template text_sentiment --ml-backend-url http://localhost:9090

Create your own ML backend

Check examples in label-studio/ml/examples directory.

]]>
+ You can easily connect your favorite machine learning framework with the Label Studio Machine Learning SDK.

That gives you the opportunity to use:

Tutorials

Quickstart

Here is a quick example tutorial on how to run the ML backend with a simple text classifier:

  1. Clone repo
    git clone https://github.com/heartexlabs/label-studio
  1. Setup environment
cd label-studio
pip install -e .
cd label_studio/ml/examples
pip install -r requirements.txt
  1. Create new ML backend
    label-studio-ml init my_ml_backend --script label-studio/ml/examples/simple_text_classifier.py
  1. Start ML backend server
    label-studio-ml start my_ml_backend
  1. Run Label Studio connecting it to the running ML backend:
    label-studio start text_classification_project --init --template text_sentiment --ml-backend-url http://localhost:9090

Create your own ML backend

Check examples in label-studio/ml/examples directory.

]]>
@@ -127,12 +138,34 @@ + + Cloud storages + + /guide/storage.html + + You can integrate popular cloud storage providers with Label Studio, collect new tasks uploaded to your buckets, and sync annotation results back to use them in your machine learning pipelines.

The cloud storage type and bucket need to be configured when the server starts, and can be further configured at runtime via the UI.

You can configure one or both:

The connection to both storages is synced, so you can see new tasks after uploading them to the bucket without restarting Label Studio.

Parameters like the prefix or the filename-matching regex can be changed at any time from the webapp interface.
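As a sketch of how the prefix and regex parameters narrow down which bucket objects are considered, here is an illustrative filter (the function name and exact matching semantics are assumptions, not Label Studio's actual implementation):

```python
# Illustrative sketch: select bucket object keys by prefix and regex.
import re

def select_keys(keys, prefix="", regex=".*"):
    pattern = re.compile(regex)
    # keep only keys under the prefix whose name matches the regex
    return [k for k in keys if k.startswith(prefix) and pattern.search(k)]

keys = ["tasks/0001.json", "tasks/0002.json", "images/cat.jpg"]
print(select_keys(keys, prefix="tasks/", regex=r"\.json$"))
# ['tasks/0001.json', 'tasks/0002.json']
```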

Amazon S3

To connect your S3 bucket with Label Studio, be sure you have programmatic access enabled. Check this link to learn more about how to set up access to your S3 bucket.

Create connection on startup

The following commands launch Label Studio, configure the connection to your S3 bucket, scan for existing tasks, and load them into the labeling app.

Read bucket with JSON-formatted tasks

label-studio start --init --source s3 --source-path my-s3-bucket

Write completions to bucket

label-studio start --init --target s3-completions --target-path my-s3-bucket

Working with Binary Large OBjects (BLOBs)

When you are storing BLOBs in your S3 bucket (like images or audio files), you might want to use them as is, by generating URLs pointing to those objects (e.g. s3://my-s3-bucket/image.jpg).
Label Studio allows you to generate input tasks with the corresponding URLs automatically on the fly. You can do this by specifying --source-params when launching the app:

label-studio start --init --source s3 --source-path my-s3-bucket --source-params "{\"data_key\": \"my-object-tag-$value\", \"use_blob_urls\": true}"

You can leave "data_key" empty (or skip it at all) then LS generates it automatically with the first task key from label config (it’s useful when you have only one object tag exposed).

Optional parameters

You can specify additional parameters as a command-line escaped JSON string via --source-params / --target-params, or from the UI.

prefix

Bucket prefix (typically used to specify internal folder/container)

regex

A regular expression for filtering bucket objects

create_local_copy

If set to true, a local copy of the remote storage will be created.

use_blob_urls

Generate task data with URLs pointing to your bucket objects (for resources like jpg, mp3, etc.). If not set, bucket objects will be interpreted as tasks in Label Studio JSON format, one object per task.

Google Cloud Storage

To connect your GCS bucket with Label Studio, be sure you have enabled programmatic access. Check this link to learn more about how to set up access to your GCS bucket.

Create connection on startup

The following commands launch Label Studio, configure the connection to your GCS bucket, scan for existing tasks, and load them into the app for labeling.

Read bucket with JSON-formatted tasks

label-studio start --init --source gcs --source-path my-gcs-bucket

Write completions to bucket

label-studio start --init --target gcs-completions --target-path my-gcs-bucket

Working with Binary Large OBjects (BLOBs)

When you are storing BLOBs in your GCS bucket (like images or audio files), you might want to use them as is, by generating URLs pointing to those objects (e.g. gs://my-gcs-bucket/image.jpg).
Label Studio allows you to generate input tasks with the corresponding URLs automatically on the fly. You can do this by specifying --source-params when launching the app:

label-studio start --init --source gcs --source-path my-gcs-bucket --source-params "{\"data_key\": \"my-object-tag-$value\", \"use_blob_urls\": true}"

You can leave "data_key" empty (or skip it at all) then LS generates it automatically with the first task key from label config (it’s useful when you have only one object tag exposed).

Optional parameters

You can specify additional parameters as a command-line escaped JSON string via --source-params / --target-params, or from the UI.

prefix

Bucket prefix (typically used to specify internal folder/container)

regex

A regular expression for filtering bucket objects

create_local_copy

If set to true, a local copy of the remote storage will be created.

use_blob_urls

Generate task data with URLs pointing to your bucket objects (for resources like jpg, mp3, etc.). If not set, bucket objects will be interpreted as tasks in Label Studio JSON format, one object per task.

]]>
+ +
+ + + + + Import tasks + + /guide/tasks.html + + Basic format

Label Studio expects a JSON-formatted list of tasks as input. Each task is a dictionary-like structure, with some specific keys reserved for internal use:

Note: in case "data" field is missing in imported task object, the whole task body is interpreted as task["data"], i.e. [{"my_key": "my_value"}] will be internally converted to [{"data": {"my_key": "my_value"}}]

Example

Here is an example of a config and a tasks list composed of one element, for a text classification project:

<View>  <Text name="message" value="$my_text"/>  <Choices name="sentiment_class" toName="message">    <Choice value="Positive"/>    <Choice value="Neutral"/>    <Choice value="Negative"/>  </Choices></View>
[{  # "id" is a reserved field, avoid using it when importing tasks  "id": 123,  # "data" requires to contain "my_text" field defined by labeling config,  # and can optionally include other fields  "data": {    "my_text": "Opossum is great",    "ref_id": 456,    "meta_info": {      "timestamp": "2020-03-09 18:15:28.212882",      "location": "North Pole"    }   },  # completions are the list of annotation results matched labeling config schema  "completions": [{    "result": [{      "from_name": "sentiment_class",      "to_name": "message",      "type": "choices",      "value": {        "choices": ["Positive"]      }    }]  }],  # "predictions" are pretty similar to "completions"   # except that they also include some ML related fields like prediction "score"  "predictions": [{    "result": [{      "from_name": "sentiment_class",      "to_name": "message",      "type": "choices",      "value": {        "choices": ["Neutral"]      }    }],  # score is used for active learning sampling mode    "score": 0.95  }]}]

Import formats

There are a few possible ways to import data files to your labeling project:

The --input-path argument points to a file or a directory where your labeling tasks reside. By default it expects JSON-formatted tasks, but you can also specify any of the other formats listed below by using the --input-format option.

JSON

label-studio init --input-path=my_tasks.json

my_tasks.json contains tasks in the basic Label Studio JSON format

Directory with JSON files

label-studio init --input-path=dir/with/json/files --input-format=json-dir

Instead of putting all tasks into one file, you can split your input data across several JSON files and specify the directory path. Each JSON file contains tasks in the basic Label Studio JSON format.

Note: if you add more files into the directory, you need to restart the Label Studio server.

CSV / TSV

When a CSV / TSV formatted text file is used, column names are interpreted as task data keys:

my_text,optional_field
this is a first task,123
this is a second task,456

Note: currently CSV / TSV files can be imported only via the UI.
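The column-to-key mapping can be sketched with the standard csv module; wrapping each row in {"data": ...} mirrors the basic task format described above:

```python
# Sketch of the CSV interpretation: column names become task data keys.
import csv
import io

csv_text = (
    "my_text,optional_field\n"
    "this is a first task,123\n"
    "this is a second task,456\n"
)

# each row becomes one task, with the header row providing the data keys
tasks = [{"data": row} for row in csv.DictReader(io.StringIO(csv_text))]
print(tasks[0]["data"]["my_text"])  # this is a first task
```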

Plain text

label-studio init --input-path=my_tasks.txt --input-format=text --label-config=config.xml

In a typical scenario, you may use only one input data stream (in other words, only one object tag specified in the label config). In this case, you don’t need to use the JSON format: simply write down your values in a plain text file, line by line, e.g.

this is a first task
this is a second task

Directory with plain text files

label-studio init --input-path=dir/with/text/files --input-format=text-dir --label-config=config.xml

You can split your input data into several plain text files and specify the directory path. Label Studio then scans each file line by line, creating one task per line. Each plain text file is formatted the same as above.
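The scanning behavior can be sketched as follows, assuming .txt files and a single object tag whose value key is "text" (both the file extension and the key name are assumptions for illustration):

```python
# Sketch: scan a directory of plain text files, one task per line.
from pathlib import Path
import tempfile

def tasks_from_text_dir(path, data_key="text"):
    tasks = []
    for f in sorted(Path(path).glob("*.txt")):
        for line in f.read_text().splitlines():
            if line.strip():  # skip empty lines
                tasks.append({"data": {data_key: line}})
    return tasks

with tempfile.TemporaryDirectory() as d:
    Path(d, "a.txt").write_text("this is a first task\nthis is a second task\n")
    tasks = tasks_from_text_dir(d)

print(len(tasks))  # 2
```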

Directory with image files

label-studio init --input-path=dir/with/images --input-format=image-dir --label-config=config.xml --allow-serving-local-files

WARNING: “--allow-serving-local-files” is intended only for locally running instances: avoid using it on remote servers unless you are sure about what you’re doing.

You can point to a local directory, which is scanned recursively for image files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

http://<host:port>/data/filename?d=<path/to/the/local/directory>

Supported formats are: .png .jpg .jpeg .tiff .bmp .gif

Directory with audio files

label-studio init --input-path=my/audios/dir --input-format=audio-dir --label-config=config.xml --allow-serving-local-files

WARNING: “--allow-serving-local-files” is intended only for locally running instances: avoid using it on remote servers unless you are sure about what you’re doing.

You can point to a local directory, which is scanned recursively for audio files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

http://<host:port>/data/filename?d=<path/to/the/local/directory>

Supported formats are: .wav .aiff .mp3 .au .flac

Import using API

Use the API to import tasks in the basic Label Studio format if for any reason you can’t access either the local filesystem or the Web UI (e.g. if you are creating a data stream)

curl -X POST -H Content-Type:application/json http://localhost:8080/api/import \
  --data "[{\"my_key\": \"my_value_1\"}, {\"my_key\": \"my_value_2\"}]"

Sampling

You can define how your imported tasks are exposed to annotators. Several options are available; to enable one of them, specify --sampling=<option> as a command line option.

sequential

Tasks are ordered ascending by their "id" fields. This is the default mode.

uniform

Tasks are sampled with equal probabilities.

prediction-score-min

The task with the minimum average prediction score is taken. When this option is set, the task["predictions"] list must be present, with a "score" field inside each prediction.

prediction-score-max

The task with the maximum average prediction score is taken. When this option is set, the task["predictions"] list must be present, with a "score" field inside each prediction.
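The two prediction-score modes can be sketched as follows (sample_by_score is an illustrative helper mirroring the described behavior, not the actual implementation):

```python
# Sketch of the prediction-score sampling modes: pick the task whose
# predictions have the lowest (or highest) average "score".
def sample_by_score(tasks, mode="prediction-score-min"):
    def avg_score(task):
        scores = [p["score"] for p in task["predictions"]]
        return sum(scores) / len(scores)
    pick = min if mode == "prediction-score-min" else max
    return pick(tasks, key=avg_score)

tasks = [
    {"id": 1, "predictions": [{"score": 0.95}]},
    {"id": 2, "predictions": [{"score": 0.40}, {"score": 0.60}]},
]

# task 2 has the lower average score (0.5 vs 0.95)
print(sample_by_score(tasks)["id"])  # 2
```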

]]>
+ +
+ + + /playground/index.html - .content { max-width: none !important; margin-left: 0 !important; padding: 1em 0 0 0; } .validation { margin-top: 1em; margin-left: 1em; color: red; text-transform: capitalize; } .CodeMirror { min-height: 500px !important; } h1 { margin-bottom: 0.5em !important; } h3 { margin: 1em !important; width: unset; height: unset; } iframe { border: 0; margin: 0 !important; } #render-editor { width: 100%; } #editor-wrap { background-color: rgb(252,252,252); padding: 0; margin: 0; display: none; } .preview { padding: 5px; overflow: auto; } .editor-row { display: flex; margin-bottom: 1em; width: 100% !important; } .data-row { display: flex; } .preview-col { width: 60%; flex: 1; background: rgb(252,252,252); } .editor-area { border: 1px solid #f48a4259; } .config-col { color: rgba(0,0,0,.6); background: rgb(252,252,252); margin-right: 2em; width: 40%; } .input-col { width: 49%; padding-right: 2%; } .output-col { width: 49%; } .hidden { display: none !important; } .message { width: 90%; max-width: 1000px; margin: 1em auto 3em auto; } .grid { display: -webkit-box; display: -ms-flexbox; display: flex; -webkit-box-orient: horizontal; -webkit-box-direction: normal; -ms-flex-direction: row; flex-direction: row; -ms-flex-wrap: wrap; flex-wrap: wrap; -webkit-box-align: stretch; -ms-flex-align: stretch; align-items: stretch; padding: 0; } .column { width: 20% !important; } .use-template { font-weight: normal!important; } .use-template:hover { border-bottom: 1px dashed darkorange; } @font-face { font-family: 'Icons'; src: url("/fonts/icons.eot"); src: url("/fonts/icons.eot?#iefix") format('embedded-opentype'), url("/fonts/icons.woff2") format('woff2'), url("/fonts/icons.woff") format('woff'), url("/fonts/icons.ttf") format('truetype'), url("/fonts/icons.svg#icons") format('svg'); font-style: normal; font-weight: normal; font-variant: normal; text-decoration: inherit; text-transform: none; } i.icon { opacity: 0.75; display: inline-block; margin: 0 0.25rem 0 
0; width: 1.18em; height: 1em; font-family: 'Icons'; font-style: normal; font-weight: normal; text-decoration: inherit; text-align: center; speak: none; -moz-osx-font-smoothing: grayscale; -webkit-font-smoothing: antialiased; -webkit-backface-visibility: hidden; backface-visibility: hidden; } i.icon:before { background: none !important; } i.icon.sound:before { content: "\f025"; } i.icon.image:before { content: "\f03e"; } i.icon.code:before { content: "\f121"; } i.icon.font:before { content: "\f031"; } i.icon.video:before { content: "\f03d"; } i.icon.share:before { content: "\f064" } i.icon.copy.outline:before { content: "\f0c5" } .share-buttons { float:right; margin: 1.2em 1em 1em 1em; } .share-buttons i { cursor: pointer; opacity: 0.5 !important; color: #f58a48; transition: 0.25s; } .share-buttons i:hover { opacity: 1 !important; transition: 0.25s; } .intro { max-width: 700px; margin: 0 auto; margin-top: 1.5em; }@media screen and (max-width: 900px) {@media only screen and (max-width: 767.98px) { .intro { padding-left: 0; } .grid { width: auto; margin-left: 0 !important; margin-right: 0 !important; } .column { width: 100% !important; margin: 0 0 !important; -webkit-box-shadow: none !important; box-shadow: none !important; padding: 1rem 1rem !important; } .editor-row { flex-direction: column; } .data-row { flex-direction: column; } .preview-col { width: 100%; } .config-col { width: 100%; } .input-col, .output-col { width: 100%; }}

Playground

Start typing in the config, and you can quickly preview the labeling interface. At the bottom of the page, you have live serialization updates of what Label Studio expects as an input and what it gives you as a result of your labeling work.

Label config

Interface preview

   Loading Label Studio, please wait ...

Input preview

...

Output preview

...
]]>
+ .content { max-width: none !important; margin-left: 0 !important; padding: 1em 0 0 0; } .validation { margin-top: 1em; margin-left: 1em; color: red; text-transform: capitalize; } .CodeMirror { min-height: 500px !important; } h1 { margin-bottom: 0.5em !important; } h3 { margin: 1em !important; width: unset; height: unset; } iframe { border: 0; margin: 0 !important; } #render-editor { width: 100%; } #editor-wrap { background-color: rgb(252,252,252); padding: 0; margin: 0; display: none; } .preview { padding: 5px; overflow: auto; } .editor-row { display: flex; margin-bottom: 1em; width: 100% !important; } .data-row { display: flex; } .preview-col { width: 60%; flex: 1; background: rgb(252,252,252); } .editor-area { border: 1px solid #f48a4259; } .config-col { color: rgba(0,0,0,.6); background: rgb(252,252,252); margin-right: 2em; width: 40%; } .input-col { width: 49%; padding-right: 2%; } .output-col { width: 49%; } .hidden { display: none !important; } .message { width: 90%; max-width: 1000px; margin: 1em auto 3em auto; } .grid { display: -webkit-box; display: -ms-flexbox; display: flex; -webkit-box-orient: horizontal; -webkit-box-direction: normal; -ms-flex-direction: row; flex-direction: row; -ms-flex-wrap: wrap; flex-wrap: wrap; -webkit-box-align: stretch; -ms-flex-align: stretch; align-items: stretch; padding: 0; } .column { width: 20% !important; } .use-template { font-weight: normal!important; } .use-template:hover { border-bottom: 1px dashed darkorange; } @font-face { font-family: 'Icons'; src: url("/fonts/icons.eot"); src: url("/fonts/icons.eot?#iefix") format('embedded-opentype'), url("/fonts/icons.woff2") format('woff2'), url("/fonts/icons.woff") format('woff'), url("/fonts/icons.ttf") format('truetype'), url("/fonts/icons.svg#icons") format('svg'); font-style: normal; font-weight: normal; font-variant: normal; text-decoration: inherit; text-transform: none; } i.icon { opacity: 0.75; display: inline-block; margin: 0 0.25rem 0 0; width: 1.18em; height: 
1em; font-family: 'Icons'; font-style: normal; font-weight: normal; text-decoration: inherit; text-align: center; speak: none; -moz-osx-font-smoothing: grayscale; -webkit-font-smoothing: antialiased; -webkit-backface-visibility: hidden; backface-visibility: hidden; } i.icon:before { background: none !important; } i.icon.sound:before { content: "\f025"; } i.icon.image:before { content: "\f03e"; } i.icon.code:before { content: "\f121"; } i.icon.font:before { content: "\f031"; } i.icon.video:before { content: "\f03d"; } i.icon.share:before { content: "\f064" } i.icon.copy.outline:before { content: "\f0c5" } i.icon.archive:before { content: "\f187"; } i.icon.eye:before { content: "\f06e"; } i.icon.bullseye:before { content: "\f140"; } i.icon.vector.square:before { content: "\f5cb"; } .share-buttons { float:right; margin: 1.2em 1em 1em 1em; } .share-buttons i { cursor: pointer; opacity: 0.5 !important; color: #f58a48; transition: 0.25s; } .share-buttons i:hover { opacity: 1 !important; transition: 0.25s; } .intro { max-width: 700px; margin: 0 auto; margin-top: 1.5em; }@media screen and (max-width: 900px) {@media only screen and (max-width: 767.98px) { .intro { padding-left: 0; } .grid { width: auto; margin-left: 0 !important; margin-right: 0 !important; } .column { width: 100% !important; margin: 0 0 !important; -webkit-box-shadow: none !important; box-shadow: none !important; padding: 1rem 1rem !important; } .editor-row { flex-direction: column; } .data-row { flex-direction: column; } .preview-col { width: 100%; } .config-col { width: 100%; } .input-col, .output-col { width: 100%; }}

Playground

Start typing in the config, and you can quickly preview the labeling interface. At the bottom of the page, you have live serialization updates of what Label Studio expects as an input and what it gives you as a result of your labeling work.

Label config

Interface preview

   Loading Label Studio, please wait ...

Input preview

...

Output preview

...
]]>
@@ -194,11 +227,11 @@ - EllipseLabels - - /tags/ellipselabels.html + Ellipse + + /tags/ellipse.html - EllipseLabels tag creates labeled ellipses. Used to create an ellipse on the image

Parameters

Param | Type | Default | Description
name | string | | name of the element
toName | string | | name of the image to label
[opacity] | float | 0.6 | opacity of ellipse
[fillColor] | string | | ellipse fill color, default is transparent
[strokeColor] | string | | stroke color
[strokeWidth] | number | 1 | width of stroke
[canRotate] | boolean | true | show or hide rotation handle

Example

<View>
  <EllipseLabels name="labels" toName="image">
    <Label value="Person" />
    <Label value="Animal" />
  </EllipseLabels>
  <Image name="image" value="$image" />
</View>
]]>
+ Ellipse is used to add an ellipse (an elliptic bounding box) to an image

Parameters

Param | Type | Default | Description
name | string | | name of the element
toName | string | | name of the image to label
[opacity] | float | 0.6 | opacity of ellipse
[fillColor] | string | | ellipse fill color, default is transparent
[strokeColor] | string | "#f48a42" | stroke color
[strokeWidth] | number | 1 | width of the stroke
[canRotate] | boolean | true | show or hide rotation handle

Example

<View>
  <Ellipse name="ellipse1-1" toName="img-1" />
  <Image name="img-1" value="$img" />
</View>
]]>
@@ -215,6 +248,17 @@ + + EllipseLabels + + /tags/ellipselabels.html + + EllipseLabels tag creates labeled ellipses. Used to create an ellipse on the image

Parameters

Param | Type | Default | Description
name | string | | name of the element
toName | string | | name of the image to label
[opacity] | float | 0.6 | opacity of ellipse
[fillColor] | string | | ellipse fill color, default is transparent
[strokeColor] | string | | stroke color
[strokeWidth] | number | 1 | width of stroke
[canRotate] | boolean | true | show or hide rotation handle

Example

<View>
  <EllipseLabels name="labels" toName="image">
    <Label value="Person" />
    <Label value="Animal" />
  </EllipseLabels>
  <Image name="image" value="$image" />
</View>
]]>
+ +
+ + + Header @@ -303,17 +347,6 @@ - - Import tasks - - /guide/tasks.html - - Basic format

Label Studio expects the JSON-formatted list of tasks as input. Each task is a dictionary-like structure, with some specific keys reserved for internal use:

  • data - task body is represented as a dictionary {"key": "value"}. It is possible to store any number of key-value pairs within task data, but there should be source keys defined by label config (i.e. what is defined by object tag’s attribute value="$key").
    Depending on the object tag type, field values are interpreted differently:
    • <Text value="$key">: value is taken as plain text
    • <HyperText value="$key">: value is HTML markup
    • <HyperText value="$key" encoding="base64">: value is base64-encoded HTML markup
    • <Audio value="$key">: value is taken as a valid URL to audio file
    • <AudioPlus value="$key">: value is taken as a valid URL to an audio file with CORS policy enabled on the server side
    • <Image value="$key">: value is a valid URL to an image file
  • (optional) id - integer task ID
  • (optional) completions - list of output annotation results, where each result is saved using Label Studio’s completion format. You can import annotation results in order to use them in subsequent labeling tasks.
  • (optional) predictions - list of model prediction results, where each result is saved using Label Studio’s prediction format. Importing predictions is useful for automatic task prelabeling & active learning & exploration.

Note: in case "data" field is missing in imported task object, the whole task body is interpreted as task["data"], i.e. [{"my_key": "my_value"}] will be internally converted to [{"data": {"my_key": "my_value"}}]

Example

Here is an example of a config and tasks list composed of one element, for text classification project:

<View>  <Text name="message" value="$my_text"/>  <Choices name="sentiment_class" toName="message">    <Choice value="Positive"/>    <Choice value="Neutral"/>    <Choice value="Negative"/>  </Choices></View>
[{  # "id" is a reserved field, avoid using it when importing tasks  "id": 123,  # "data" requires to contain "my_text" field defined by labeling config,  # and can optionally include other fields  "data": {    "my_text": "Opossum is great",    "ref_id": 456,    "meta_info": {      "timestamp": "2020-03-09 18:15:28.212882",      "location": "North Pole"    }   },  # completions are the list of annotation results matched labeling config schema  "completions": [{    "result": [{      "from_name": "sentiment_class",      "to_name": "message",      "type": "choices",      "value": {        "choices": ["Positive"]      }    }]  }],  # "predictions" are pretty similar to "completions"   # except that they also include some ML related fields like prediction "score"  "predictions": [{    "result": [{      "from_name": "sentiment_class",      "to_name": "message",      "type": "choices",      "value": {        "choices": ["Neutral"]      }    }],    "score": 0.95  }]}]

Import formats

There are a few possible ways to import data files to your labeling project:

  • Start Label Studio without specifying input path and then import through the web interfaces available at http://127.0.0.1:8080/import

  • Initialize Label Studio project and directly specify the paths, e.g. label-studio init --input-path my_tasks.json --input-format json

The --input-path argument points to a file or a directory where your labeling tasks reside. By default it expects JSON-formatted tasks, but you can also specify any of the other formats listed below by using the --input-format option.

JSON

label-studio init --input-path=my_tasks.json

tasks.json contains tasks in a basic Label Studio JSON format

Directory with JSON files

label-studio init --input-path=dir/with/json/files --input-format=json-dir

Instead of putting all tasks into one file, you can split your input data into several tasks.json, and specify the directory path. Each JSON file contains tasks in a basic Label Studio JSON format.

Note: if you add more files into the directory, you need to restart the Label Studio server.

CSV / TSV

When a CSV / TSV formatted text file is used, column names are interpreted as task data keys:

my_text,optional_field
this is a first task,123
this is a second task,456

Note: currently, CSV / TSV files can be imported only via the UI.
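If you need to import a CSV from the command line anyway, one workaround (not a built-in Label Studio command) is to convert it to the basic JSON format first and then use --input-path. A rough Python sketch of the row-to-task mapping:

```python
import csv
import io
import json

def rows_to_tasks(reader):
    """Each CSV row becomes one task; column names become task data keys."""
    return [{"data": row} for row in csv.DictReader(reader)]

# Works the same with open("my_tasks.csv", newline="") instead of StringIO.
sample = "my_text,optional_field\nthis is a first task,123\nthis is a second task,456\n"
tasks = rows_to_tasks(io.StringIO(sample))

with open("my_tasks.json", "w") as f:
    json.dump(tasks, f, indent=2)
```

The resulting my_tasks.json can then be imported with label-studio init --input-path=my_tasks.json.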

Plain text

label-studio init --input-path=my_tasks.txt --input-format=text --label-config=config.xml

In a typical scenario, you may use only one input data stream (in other words, only one object tag specified in the label config). In this case, you don't need the JSON format: simply write your values in a plain text file, one per line, e.g.

this is a first task
this is a second task

Directory with plain text files

label-studio init --input-path=dir/with/text/files --input-format=text-dir --label-config=config.xml

You can split your input data into several plain text files, and specify the directory path. Then Label Studio scans each file line-by-line, creating one task per line. Each plain text file is formatted the same as above.

Directory with image files

label-studio init --input-path=dir/with/images --input-format=image-dir --label-config=config.xml

You can point to a local directory, which is scanned recursively for image files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

http://<host:port>/static/filename?d=<path/to/the/local/directory>

Supported formats are: .png .jpg .jpeg .tiff .bmp .gif
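The scan-and-link step can be approximated in a few lines; this is only a sketch of the behavior described above (the exact URL encoding and file ordering Label Studio applies may differ):

```python
import os
import urllib.parse

# Extensions accepted by the image-dir import, per the list above.
SUPPORTED = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}

def image_urls(root, host="localhost", port=8080):
    """Recursively collect image files and build /static URLs for them."""
    urls = []
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in SUPPORTED:
                urls.append(
                    f"http://{host}:{port}/static/{name}"
                    f"?d={urllib.parse.quote(dirpath)}"
                )
    return urls
```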

Directory with audio files

label-studio init --input-path=my/audios/dir --input-format=audio-dir --label-config=config.xml

You can point to a local directory, which is scanned recursively for audio files. Each file is used to create one task. Since Label Studio works only with URLs, a web link is created for each task, pointing to your local directory as follows:

http://<host:port>/static/filename?d=<path/to/the/local/directory>

Supported formats are: .wav .aiff .mp3 .au .flac

Import using API

Use the API to import tasks in the basic Label Studio format if for any reason you can't access the local filesystem or the Web UI (e.g. if you are creating a data stream).

curl -X POST -H Content-Type:application/json http://localhost:8080/api/import \
  --data "[{\"my_key\": \"my_value_1\"}, {\"my_key\": \"my_value_2\"}]"
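The same request can be built from Python using only the standard library (host, port and payload mirror the curl example; adjust them for your server):

```python
import json
import urllib.request

def build_import_request(tasks, host="localhost", port=8080):
    """Build the POST request for /api/import; pass the result to
    urllib.request.urlopen(...) to actually send it."""
    return urllib.request.Request(
        f"http://{host}:{port}/api/import",
        data=json.dumps(tasks).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_import_request([{"my_key": "my_value_1"}, {"my_key": "my_value_2"}])
```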
]]>
- -
- - - Labels @@ -348,22 +381,22 @@ - PolygonLabels - - /tags/polygonlabels.html + Polygon + + /tags/polygon.html - PolygonLabels tag, create labeled polygons

Parameters

Param          Type                    Default    Description
name           string                             name of tag
toName         string                             name of image to label
[opacity]      number                  0.6        opacity of polygon
[fillColor]    string                             polygon fill color, default is transparent
[strokeColor]  string                             stroke color
[strokeWidth]  number                  1          width of stroke
[pointSize]    small | medium | large  medium     size of polygon handle points
[pointStyle]   rectangle | circle      rectangle  style of points

Example

<View>
  <Image name="image" value="$image" />
  <PolygonLabels name="labels" toName="image">
    <Label value="Car" />
    <Label value="Sign" />
  </PolygonLabels>
</View>
]]>
+ Polygon is used to add polygons to an image. Just start clicking on the image.

Parameters

Param          Type                    Default  Description
name           string                           name of tag
toName         string                           name of image to label
[opacity]      number                  0.6      opacity of polygon
[fillColor]    string                           polygon fill color, default is transparent
[strokeColor]  string                           stroke color
[strokeWidth]  number                  1        width of stroke
[pointSize]    small | medium | large  medium   size of polygon handle points
[pointStyle]   rectangle | circle      circle   style of points

Example

<View>
  <Polygon name="rect-1" toName="img-1" />
  <Image name="img-1" value="$img" />
</View>
]]>
- Polygon - - /tags/polygon.html + PolygonLabels + + /tags/polygonlabels.html - Polygon is used to add polygons to an image. Just start to click on the image.

Parameters

Param          Type                    Default  Description
name           string                           name of tag
toName         string                           name of image to label
[opacity]      number                  0.6      opacity of polygon
[fillColor]    string                           polygon fill color, default is transparent
[strokeColor]  string                           stroke color
[strokeWidth]  number                  1        width of stroke
[pointSize]    small | medium | large  medium   size of polygon handle points
[pointStyle]   rectangle | circle      circle   style of points

Example

<View>
  <Polygon name="rect-1" toName="img-1" />
  <Image name="img-1" value="$img" />
</View>
]]>
+ PolygonLabels tag, create labeled polygons

Parameters

Param          Type                    Default    Description
name           string                             name of tag
toName         string                             name of image to label
[opacity]      number                  0.6        opacity of polygon
[fillColor]    string                             polygon fill color, default is transparent
[strokeColor]  string                             stroke color
[strokeWidth]  number                  1          width of stroke
[pointSize]    small | medium | large  medium     size of polygon handle points
[pointStyle]   rectangle | circle      rectangle  style of points

Example

<View>
  <Image name="image" value="$image" />
  <PolygonLabels name="labels" toName="image">
    <Label value="Car" />
    <Label value="Sign" />
  </PolygonLabels>
</View>
]]>
@@ -403,22 +436,22 @@ - Relation - - /tags/relation.html + RectangleLabels + + /tags/rectanglelabels.html - Relation tag represents a single relation label

Parameters

Param         Type    Description
value         string  value of the relation
[background]  string  background color of active label

Example

<View>
  <Relations>
    <Relation value="Name 1" />
    <Relation value="Name 2" />
  </Relations>
</View>
]]>
+ RectangleLabels tag creates labeled rectangles on the image

Parameters

Param          Type     Default  Description
name           string            name of the element
toName         string            name of the image to label
[opacity]      float    0.6      opacity of rectangle
[fillColor]    string            rectangle fill color, default is transparent
[strokeColor]  string            stroke color
[strokeWidth]  number   1        width of stroke
[canRotate]    boolean  true     show or hide rotation handle

Example

<View>
  <RectangleLabels name="labels" toName="image">
    <Label value="Person" />
    <Label value="Animal" />
  </RectangleLabels>
  <Image name="image" value="$image" />
</View>
]]>
- RectangleLabels - - /tags/rectanglelabels.html + Relation + + /tags/relation.html - RectangleLabels tag creates labeled rectangles on the image

Parameters

Param          Type     Default  Description
name           string            name of the element
toName         string            name of the image to label
[opacity]      float    0.6      opacity of rectangle
[fillColor]    string            rectangle fill color, default is transparent
[strokeColor]  string            stroke color
[strokeWidth]  number   1        width of stroke
[canRotate]    boolean  true     show or hide rotation handle

Example

<View>
  <RectangleLabels name="labels" toName="image">
    <Label value="Person" />
    <Label value="Animal" />
  </RectangleLabels>
  <Image name="image" value="$image" />
</View>
]]>
+ Relation tag represents a single relation label

Parameters

Param         Type    Description
value         string  value of the relation
[background]  string  background color of active label

Example

<View>
  <Relations>
    <Relation value="Name 1" />
    <Relation value="Name 2" />
  </Relations>
</View>
]]>
@@ -458,22 +491,22 @@ - Text - - /tags/text.html + Table + + /tags/table.html - Text tag shows an Text markup that can be labeled

Parameters

Param               Type           Default   Description
name                string                   name of the element
value               string                   value of the element
[selectionEnabled]  boolean        true      enable or disable selection
[highlightColor]    string                   hex string with highlight color; if not provided, uses the label's color
[granularity]       symbol | word  symbol    control per-symbol or per-word selection
[showLabels]        boolean        true      show labels next to the region
[encoding]          string         "string"  "string" or "base64"

Example

<Text name="text-1" value="$text" granularity="symbol" highlightColor="#ff0000" />
]]>
+ Table tag, show object keys and values in a table

Parameters

  • value [string]

Examples

<View>
  <Table name="text-1" value="$text"></Table>
</View>
]]>
- Table - - /tags/table.html + Text + + /tags/text.html - Table tag, show object keys and values in a table

Parameters

  • value [string]

Examples

<View>
  <Table name="text-1" value="$text"></Table>
</View>
]]>
+ Text tag shows a Text markup that can be labeled

Parameters

Param               Type           Default   Description
name                string                   name of the element
value               string                   value of the element
[selectionEnabled]  boolean        true      enable or disable selection
[highlightColor]    string                   hex string with highlight color; if not provided, uses the label's color
[granularity]       symbol | word  symbol    control per-symbol or per-word selection
[showLabels]        boolean        true      show labels next to the region
[encoding]          string         "string"  "string" or "base64"

Example

<Text name="text-1" value="$text" granularity="symbol" highlightColor="#ff0000" />
]]>
@@ -677,17 +710,6 @@ - - Ellipse - - /tags/ellipse.html - - Ellipse is used to add ellipse (elleptic Bounding Box) to an image

Parameters

Param          Type     Default    Description
name           string              name of the element
toName         string              name of the image to label
[opacity]      float    0.6        opacity of ellipse
[fillColor]    string              ellipse fill color, default is transparent
[strokeColor]  string   "#f48a42"  stroke color
[strokeWidth]  number   1          width of the stroke
[canRotate]    boolean  true       show or hide rotation handle

Example

<View>
  <Ellipse name="ellipse1-1" toName="img-1" />
  <Image name="img-1" value="$img" />
</View>
]]>
- -
- - - diff --git a/docs/source/guide/completions.md b/docs/source/guide/export.md similarity index 99% rename from docs/source/guide/completions.md rename to docs/source/guide/export.md index 3d0fa3b41649..3e365433c332 100644 --- a/docs/source/guide/completions.md +++ b/docs/source/guide/export.md @@ -1,7 +1,7 @@ --- title: Export results type: guide -order: 104 +order: 105 --- Your annotations are stored in [raw completion format](#Completion-format) inside `my_project_name/completions` directory, one file per labeled task named as `task_id.json`. diff --git a/docs/source/guide/labeling.md b/docs/source/guide/labeling.md index af79c2cb62bd..c1d43a1e338e 100644 --- a/docs/source/guide/labeling.md +++ b/docs/source/guide/labeling.md @@ -1,7 +1,7 @@ --- title: Labeling type: guide -order: 103 +order: 104 --- Let's explore the complex example of multi-task labeling which includes text + image + audio data objects: diff --git a/docs/source/guide/ml.md b/docs/source/guide/ml.md index b04f057074af..b5dcf65fc22f 100644 --- a/docs/source/guide/ml.md +++ b/docs/source/guide/ml.md @@ -52,7 +52,36 @@ Here is a quick example tutorial on how to run the ML backend with a simple text label-studio start text_classification_project --init --template text_sentiment --ml-backend-url http://localhost:9090 ``` +## Start with docker compose +Label Studio ML scripts include everything you need to create production ready ML backend server, powered by docker. It uses [uWSGI](https://uwsgi-docs.readthedocs.io/en/latest/) + [supervisord](http://supervisord.org/) stack, and handles background training jobs using [RQ](https://python-rq.org/). + +After running this command: + +```bash +label-studio-ml init my-ml-backend --script label_studio/ml/examples/simple_text_classifier.py +``` + +you'll see configs in `my-ml-backend/` directory needed to build and run docker image using docker-compose. + +Some preliminaries: + +1. 
Ensure all requirements are specified in `my-ml-backend/requirements.txt` file, e.g. place + + ```requirements.txt + scikit-learn + ``` + +2. There are no services currently running on ports 9090, 6379 (otherwise change default ports in `my-ml-backend/docker-compose.yml`) + +Then from `my-ml-backend/` directory run +```bash +docker-compose up +``` + +The server starts listening on port 9090, and you can connect it to Label Studio by specifying `--ml-backend http://localhost:9090` + or via UI on **Model** page. + ## Create your own ML backend Check examples in `label-studio/ml/examples` directory. \ No newline at end of file diff --git a/docs/source/guide/setup.md b/docs/source/guide/setup.md index 67c290351216..6192d36c6361 100644 --- a/docs/source/guide/setup.md +++ b/docs/source/guide/setup.md @@ -1,7 +1,7 @@ --- title: Project setup type: guide -order: 102 +order: 101 --- **Project** is a directory where all annotation assets are located. It is a self-contained entity: when you start Label Studio for the first time e.g. `label-studio start ./my_project --init`, diff --git a/docs/source/guide/storage.md b/docs/source/guide/storage.md index ac606b85834c..d3f1833ec490 100644 --- a/docs/source/guide/storage.md +++ b/docs/source/guide/storage.md @@ -1,12 +1,12 @@ --- title: Cloud storages type: guide -order: 101 +order: 103 --- You can integrate the popular cloud storage with Label Studio, collect new tasks uploaded to your buckets, and sync back annotation results to use them in your machine learning pipelines. -Cloud storage type and bucket need to be configured during the start of the server, and further configured during the runtime via UI. +You can configure storage type, bucket and prefixes during the start of the server or during the runtime via UI on **Tasks** page. 
You can configure one or both: @@ -17,6 +17,8 @@ The connection to both storages is synced, so you can see new tasks after upload The parameters like prefix or matching filename regex could be changed any time from the webapp interface. +> Note: Choose target storage carefully: be sure it's empty when you just start labeling project, or it contains completions that match previously created/import tasks from source storage. Tasks are synced with completions based on internal ids (keys in `source.json`/`target.json` files in your project directory), so if you accidentally connect to the target storage with existed completions with the same ids, you may fail with undefined behaviour. + ## Amazon S3 To connect your [S3](https://aws.amazon.com/s3) bucket with Label Studio, be sure you have programmatic access enabled. [Check this link](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration) to learn more how to set up access to your S3 bucket. diff --git a/docs/source/guide/tasks.md b/docs/source/guide/tasks.md index 1e5e68467f15..a86c8d13c795 100644 --- a/docs/source/guide/tasks.md +++ b/docs/source/guide/tasks.md @@ -1,7 +1,7 @@ --- -title: Import tasks +title: Tasks type: guide -order: 101 +order: 102 --- ## Basic format @@ -176,6 +176,12 @@ http:///data/filename?d= Supported formats are: `.wav` `.aiff` `.mp3` `.au` `.flac` +### Upload resource files on Import page + +For label configs with one data key (e.g.: one input image) Label Studio supports a file uploading via GUI, +just drag & drop your files (or select them from file dialog) on "Import" page. +This option is suitable for limited file number. 
+ ## Import using API @@ -186,6 +192,53 @@ curl -X POST -H Content-Type:application/json http://localhost:8080/api/import \ --data "[{\"my_key\": \"my_value_1\"}, {\"my_key\": \"my_value_2\"}]" ``` +## Retrieve tasks using API + +You can retrieve project settings including total task count using API in JSON format: + +```json +http:///api/project +``` + +Response example: + +```json +{ + ... + "task_count": 3, + ... +} +``` + +To get tasks with pagination in JSON format: + +``` +http:///api/tasks?page=1&page_size=10&order={-}[id|completed_at] +``` + +Response example: + +```json +[ + { + "completed_at": "2020-05-29 03:31:15", + "completions": [ + { + "created_at": 1590712275, + "id": 10001, + "lead_time": 4.0, + "result": [ ... ] + } + ], + "data": { + "image": "s3://htx-dev/dataset/training_set/dogs/dog.102.jpg" + }, + "id": 2, + "predictions": [] + } +] +``` + ## Sampling You can define the way of how your imported tasks are exposed to annotators. Several options are available. To enable one of them, specify `--sampling=