Data Collection SDK

The collect_data function captures frames from a given source at specified intervals. Users can configure how often frames are collected, how many frames are collected, and whether the collected frames are also uploaded to a cloud storage service such as AWS S3. The collected frames are stored in a Label Studio project, where they can be annotated and used to train or retrain models.

Parameter	Type	Default	Required	Description	When to Use	Accepted Values
source_id	Integer	None	Yes	Unique identifier for the video/camera source.	Always required.	Any valid source ID
time_interval	Integer	None	No	Time interval (in seconds) for frame collection.	Use when you want to collect frames based on time, not count.	Any positive integer
number_of_frames_requested	Integer	None	No	Total number of frames to collect.	Use when you want to collect a specific number of frames.	Any positive integer
skip_frame_count	Integer	5	No	Number of frames to skip between captures.	Use to reduce processing and avoid redundant data.	Any positive integer
s3_bucket_name	String	None	No	Name of the AWS S3 bucket.	Use if you want to upload collected frames to AWS.	Valid AWS S3 bucket name
s3_access_key	String	None	No	AWS access key for the specified S3 bucket.	Required only when using custom AWS credentials.	Valid AWS credentials
s3_secret_key	String	None	No	AWS secret key for the specified S3 bucket.	Required only when using custom AWS credentials.	Valid AWS credentials
s3_cloud_path	String	None	No	S3 path to upload the collected frames.	Use if uploading frames to a specific location in the S3 bucket.	Valid S3 path (e.g., s3://bucket-name/path/)
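
The constraints in the table can be checked before a call is made. The following is a minimal sketch, not part of the SDK: build_collect_args is a hypothetical helper, and it assumes that time_interval and number_of_frames_requested are alternatives, so exactly one of them is passed per call. The table only says when to use each one, so the SDK itself may behave differently if both or neither are given.

```python
from typing import Any, Dict, Optional


def build_collect_args(
    source_id: int,
    time_interval: Optional[int] = None,
    number_of_frames_requested: Optional[int] = None,
    skip_frame_count: int = 5,  # default taken from the parameter table
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return keyword arguments."""
    # Assumption: exactly one collection mode (time or count) is chosen.
    if (time_interval is None) == (number_of_frames_requested is None):
        raise ValueError(
            "pass exactly one of time_interval or number_of_frames_requested"
        )

    args: Dict[str, Any] = {
        "source_id": source_id,
        "skip_frame_count": skip_frame_count,
    }
    if time_interval is not None:
        args["time_interval"] = time_interval
    else:
        args["number_of_frames_requested"] = number_of_frames_requested

    # The table lists "any positive integer" for every numeric parameter
    # except source_id, which only has to be a valid ID.
    for name, value in args.items():
        if name == "source_id":
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return args


# With an initialised `sdk` object, the result would be unpacked into the call:
# data_collect = sdk.collect_data(**build_collect_args(174, time_interval=10))
```

Validating on the caller's side keeps mistakes such as a zero frame count from reaching the SDK; whether the SDK performs the same checks is not stated in this document.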

Collect Data from a Source:

Description:

The collect_data method extracts frames from the source identified by source_id. Collection is driven either by a time interval (time_interval) or by a requested number of frames (number_of_frames_requested). Users can also skip frames between captures with skip_frame_count to reduce redundant data, and optionally store the collected frames in an AWS S3 bucket.

Example 1: Collecting data based on the time interval

data_collect = sdk.collect_data(
    source_id=174,
    time_interval=10,    # 10 seconds
    skip_frame_count=5,  # Optional
)

print(data_collect)

Example 2: Collecting data based on the number of frames

data_collect = sdk.collect_data(
    source_id=174,
    number_of_frames_requested=200,
    skip_frame_count=5,  # Optional
)

print(data_collect)
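
The S3 parameters from the table are not shown in the examples above. The sketch below illustrates one way to assemble them; s3_upload_kwargs is a hypothetical helper, the bucket name and path are placeholders, and reading credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables is an assumption made here to keep secrets out of source code, not something the SDK is documented to require.

```python
import os
from typing import Dict, Mapping


def s3_upload_kwargs(
    bucket_name: str,
    cloud_path: str,
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Hypothetical helper: assemble the S3 parameters listed in the table."""
    kwargs = {"s3_bucket_name": bucket_name, "s3_cloud_path": cloud_path}
    # Assumption: custom credentials, if any, live in the standard AWS
    # environment variables. They are added only when both are present.
    access_key = env.get("AWS_ACCESS_KEY_ID")
    secret_key = env.get("AWS_SECRET_ACCESS_KEY")
    if access_key and secret_key:
        kwargs["s3_access_key"] = access_key
        kwargs["s3_secret_key"] = secret_key
    return kwargs


# Placeholder bucket and path, not real resources.
s3_kwargs = s3_upload_kwargs("my-frames-bucket", "s3://my-frames-bucket/source-174/")

# With an initialised `sdk` object, the upload options are passed alongside
# either collection mode:
# data_collect = sdk.collect_data(
#     source_id=174,
#     number_of_frames_requested=200,
#     **s3_kwargs,
# )
```

The table marks the access and secret keys as required only when custom AWS credentials are used, which is why the helper leaves them out when the environment variables are not set.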

Once the data has been collected, users receive a Label Studio URL where they can annotate the images. After completing the annotations, they can export the dataset as a zip file, which is then used for training or retraining.
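
Before handing the exported zip file to a training job, it can be useful to confirm what it contains. The sketch below uses only the Python standard library; summarize_export is a hypothetical helper, and it makes no assumption about the export layout beyond the file being a valid zip archive.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict


def summarize_export(zip_path: str) -> Dict[str, int]:
    """Count the files in a zip archive by extension, skipping directories."""
    counts: Counter = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Zip member names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return dict(counts)


# Usage with a hypothetical file name:
# print(summarize_export("label_studio_export.zip"))
```

A mismatch between the number of image files and the number of frames requested is an early sign that the export or the annotation step was incomplete.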
