akride package
Subpackages
- akride.core package
- Subpackages
- akride.core.conf package
- akride.core.entities package
- Submodules
- akride.core.entities.bgc_job module
- akride.core.entities.catalogs module
- akride.core.entities.containers module
- akride.core.entities.datasets module
- akride.core.entities.docker_image module
- akride.core.entities.docker_pipeline module
- akride.core.entities.docker_repository module
- akride.core.entities.entity module
- akride.core.entities.jobs module
- akride.core.entities.pipeline module
- akride.core.entities.resultsets module
- akride.core.entities.sms_secrets module
- Module contents
 
- akride.core.models package
 
- Submodules
- akride.core.constants module
- Constants
- Constants.AKRIDE_TMP_DIR
- Constants.DATASET_FILES_COLUMNS
- Constants.DEBUGGING_ENABLED
- Constants.DEFAULT_IMAGE_BLOB_EXPR
- Constants.DEFAULT_SAAS_ENDPOINT
- Constants.DEFAULT_VIDEO_BLOB_EXPR
- Constants.IMPORT_CATALOG_STATUS_CHECK_ATTEMPTS
- Constants.IMPORT_CATALOG_STATUS_CHECK_INTERVAL_S
- Constants.INGEST_IMAGE_PARTITION_SIZE
- Constants.INGEST_IMAGE_WF_TOKEN_SIZE
- Constants.INGEST_VIDEO_PARTITION_SIZE
- Constants.INGEST_VIDEO_WF_TOKEN_SIZE
- Constants.LOG_CONFIG_FILE_NAME
- Constants.PARTITIONED_TABLE_COLUMNS
- Constants.PARTITION_TIME_FRAME
- Constants.PROCESS_IMAGE_WF_TOKEN_SIZE
- Constants.PROCESS_VIDEO_WF_TOKEN_SIZE
- Constants.THUMBNAIL_AGGREGATOR_SDK_DETAILS
- Constants.VIDEO_CHUNK_SIZE
 
 
- akride.core.enums module
- akride.core.exceptions module
- akride.core.types module
- AnalyzeJobParams
- CatalogDetails
- CatalogDetails.ground_truth_class_column
- CatalogDetails.ground_truth_coordinates_class_column
- CatalogDetails.ground_truth_coordinates_column
- CatalogDetails.prediction_class_column
- CatalogDetails.prediction_coordinates_class_score_column
- CatalogDetails.prediction_coordinates_column
- CatalogDetails.score_column
 
- CatalogTable
- ClientManager
- ClusterRetrievalSpec
- Column
- ConfusionMatrix
- ConfusionMatrixCellSpec
- CoresetSamplingSpec
- JobOpSpec
- JobStatistics
- JoinCondition
- PlotFeaturizer
- SampleInfoList
- SimilaritySearchSpec
 
- Module contents
 
Submodules
akride.background_task_manager module
- class akride.background_task_manager.BackgroundTaskManager[source]
- Bases: object
- Helper class to manage background tasks
- is_task_running(entity_id: str, task_type: BackgroundTaskType) bool[source]
- Checks whether a background task is running.
- Parameters:
- entity_id (str) – Entity ID associated with the task.
- task_type (BackgroundTaskType) – The type of the background task.
- Returns:
- A boolean representing whether the task is running or not.
- Return type:
- bool
 
- start_task(entity_id: str, task_type: BackgroundTaskType, target_function, *args, **kwargs) BackgroundTask[source]
- Starts a background task.
- Parameters:
- entity_id (str) – Entity ID associated with the task.
- task_type (BackgroundTaskType) – The type of the background task.
- target_function – The target function to run.
- args – Arguments for the target function.
- kwargs – Keyword arguments for the target function.
- Returns:
- The background task object.
- Return type:
- BackgroundTask
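A minimal usage sketch follows; the entity id, the target function, and the BackgroundTaskType member (and its import path, assumed here to be akride.core.enums) are illustrative assumptions:

```python
from akride.background_task_manager import BackgroundTaskManager
from akride.core.enums import BackgroundTaskType  # assumed location of the enum

def featurize_entity(entity_id: str) -> None:
    """Placeholder work to run off the main thread."""
    ...

manager = BackgroundTaskManager()
# "INGEST" is an illustrative member name; use the BackgroundTaskType that applies.
task = manager.start_task("dataset-123", BackgroundTaskType.INGEST, featurize_entity, "dataset-123")
print(manager.is_task_running("dataset-123", BackgroundTaskType.INGEST))
```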
 
 
akride.client module
Copyright (C) 2025, Akridata, Inc - All Rights Reserved. Unauthorized copying of this file, via any medium is strictly prohibited
- class akride.client.AkriDEClient(saas_endpoint: str | None = None, api_key: str | None = None, sdk_config_tuple: Tuple[str, str] | None = None, sdk_config_dict: dict | None = None, sdk_config_file: str | None = None)[source]
- Bases: object
- Client class to connect to DataExplorer
- abort_bgc_jobs(dataset: Dataset, job: BGCJob | None = None)[source]
- Aborts background cataloging jobs for the dataset
 - add_to_catalog(dataset: Dataset, table_name: str, csv_file_path: str, import_identifier: str | None = None) bool[source]
- Adds new items to an existing catalog.
- Parameters:
- dataset (Dataset) – The dataset whose catalog is updated.
- table_name (str) – The name of the catalog table to add items to.
- csv_file_path (str) – The path to the CSV file containing the new items.
- import_identifier (Optional[str]) – Unique identifier for importing data.
- Returns:
- Indicates whether the operation was successful.
- Return type:
- bool
 
 - attach_pipeline_to_dataset(pipeline_id, dataset_id, attachment_policy_type: AttachmentPolicyType | None = 'ON_DEMAND')[source]
- Attaches the specified pipeline to the dataset, using the given attachment policy type
 - attach_pipelines(dataset: Dataset, featurizer_types: Set[FeaturizerType], attachment_policy_type: AttachmentPolicyType | None = 'PUSH_MODE')[source]
- Attach pipelines based on the featurizer types
- Parameters:
- dataset (Dataset) – The dataset object to attach pipelines to.
- featurizer_types (Set[FeaturizerType]) – Featurizers to run for the dataset
- attachment_policy_type (Optional[AttachmentPolicyType]) – Pipeline attachment policy type
 
- Return type:
- None 
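A minimal sketch, assuming FeaturizerType is importable from akride.core.enums and has members such as FULL_IMAGE and PATCH (verify the exact names in that module); client is an AkriDEClient:

```python
from akride.core.enums import FeaturizerType  # assumed import path

dataset = client.get_datasets()[0]  # any existing dataset
client.attach_pipelines(
    dataset=dataset,
    featurizer_types={FeaturizerType.FULL_IMAGE, FeaturizerType.PATCH},  # illustrative members
    attachment_policy_type="PUSH_MODE",
)
```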
 
 - check_if_dataset_files_to_be_registered(dataset: Dataset, file_paths: List[str]) bool[source]
- Checks whether the given files are not yet registered for the dataset
 - create_dataset(spec: Dict[str, Any]) Entity[source]
- Creates a new dataset entity.
- Parameters:
- spec (Dict[str, Any]) – The dataset spec. The spec should have the following fields:
- dataset_name: str
- The name of the new dataset.
- dataset_namespace: str, optional
- The namespace for the dataset, by default 'default'.
- data_type: DataType, optional
- The type of data to store in the dataset, by default DataType.IMAGE.
- glob_pattern: str, optional
- The glob pattern for the dataset. By default, for image datasets: '*(png|jpg|gif|jpeg|tiff|tif|bmp)'; for video datasets: '*(mov|mp4|avi|wmv|mpg|mpeg|mkv)'.
- sample_frame_rate: float, optional
- The frame rate per second (fps) for videos. Applicable only for video datasets.
- overwrite: bool, optional
- Overwrite if a dataset with the same name exists.
 
 
- Returns:
- The created entity 
- Return type:
- Entity
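For example, a minimal sketch creating an image dataset (the dataset name is a placeholder; client is an AkriDEClient):

```python
dataset = client.create_dataset(spec={
    "dataset_name": "wildlife-images",   # placeholder name
    "dataset_namespace": "default",
    "overwrite": False,
    # "data_type" defaults to DataType.IMAGE; "glob_pattern" defaults to the image pattern above
})
```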
 
 - create_docker_pipeline(spec: DockerPipelineSpec) DockerPipeline | None[source]
- Creates a Pipeline using the Docker Image
- Parameters:
- spec (DockerPipelineSpec) – Pipeline Specification
- Returns:
- Object representing the Docker Pipeline
- Return type:
- DockerPipeline
 
 - create_featurizer_image_spec(image_name: str, description: str, command: str, repository_name: str, properties: Dict[str, Any], gpu_filter: bool | None = None, gpu_mem_fraction: float | None = None, allow_no_gpu: bool | None = None, namespace: str | None = 'default', image_tag: str | None = 'latest', name: str | None = None) DockerImageSpec[source]
- Creates a DockerImageSpec object that specifies the Featurizer Docker Image to be created
- Parameters:
- image_name: str
- The name of the Docker Image present in the repository
- description: str
- A short description of the Docker Image
- command: str
- Command that is used to run the featurizer docker
- repository_name: str
- Name of the repository in DE that the Docker Image will be pulled from
- properties: Dict[str, Any]
- Properties specific to the Docker Image
- gpu_filter: Optional[bool]
- Flag to specify whether the Image can run on a GPU or not
- gpu_mem_fraction: Optional[float]
- The fraction of GPU memory to be reserved for the Docker Image. Should be > 0 and <= 1
- allow_no_gpu: Optional[bool]
- Flag to specify whether the Image can also be run if no GPU is available
- namespace: Optional[str]
- Namespace of the Docker Image, by default 'default'
- image_tag: Optional[str]
- Tag of the Docker Image in the docker repository, by default 'latest'
- name: Optional[str]
- Display name of the Docker Image on DE, by default the same as image_name
- Returns:
- Object representing a Docker Image Specification
- Return type:
- DockerImageSpec
 
 - create_featurizer_pipeline_spec(pipeline_name: str, pipeline_description: str, featurizer_name: str, data_type: str | None = DataType.IMAGE, namespace: str | None = 'default') DockerPipelineSpec[source]
- Creates a DockerPipelineSpec object that specifies the Featurizer Docker Pipeline to be created
- Parameters:
- pipeline_name: str
- The name of the Docker pipeline
- pipeline_description: str
- A short description of the Docker Pipeline
- featurizer_name: str
- Docker Image name of the featurizer to uniquely identify the image
- data_type: Optional[str]
- Data Type of the pipeline, by default DataType.IMAGE. Allowed values are DataType.IMAGE and DataType.VIDEO
- namespace: Optional[str]
- Namespace of the Docker Pipeline, by default 'default'
- Returns:
- Object representing a Docker Pipeline Specification
- Return type:
- DockerPipelineSpec
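A sketch of registering a featurizer image and wrapping it in a pipeline; the repository, image, command, and property values are placeholders, and DataType is assumed to be importable from akride.core.enums:

```python
from akride.core.enums import DataType  # assumed import path

image_spec = client.create_featurizer_image_spec(
    image_name="my-featurizer",               # placeholder image name in the repository
    description="Custom embedding featurizer",
    command="python /app/featurize.py",       # placeholder container command
    repository_name="my-docker-repo",         # placeholder DE repository
    properties={},                            # image-specific properties
    gpu_filter=True,
    gpu_mem_fraction=0.5,
    allow_no_gpu=True,
)
docker_image = client.register_docker_image(spec=image_spec)

pipeline_spec = client.create_featurizer_pipeline_spec(
    pipeline_name="my-featurizer-pipeline",
    pipeline_description="Pipeline around my-featurizer",
    featurizer_name="my-featurizer",
    data_type=DataType.IMAGE,
)
pipeline = client.create_docker_pipeline(spec=pipeline_spec)
```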
 
 - create_job(spec: JobSpec) Job[source]
- Creates an explore job for the specified dataset.
- Parameters:
- dataset: Dataset
- The dataset to explore. 
- spec: JobSpec
- The job specification. 
- Returns:
- Job
- The newly created Job object. 
 
 - create_job_spec(dataset: Dataset, job_type: str | JobType = 'EXPLORE', job_name: str = '', predictions_file: str = '', cluster_algo: str | ClusterAlgoType = ClusterAlgoType.HDBSCAN, embed_algo: str | EmbedAlgoType = EmbedAlgoType.UMAP, num_clusters: int | None = None, max_images: int = 1000, catalog_table: CatalogTable | None = None, analyze_params: AnalyzeJobParams | None = None, pipeline: Pipeline | None = None, filters: List[Condition] | None = None, reference_job: Job | None = None) JobSpec[source]
- Creates a JobSpec object that specifies how a job is to be created.
- Parameters:
- dataset: Dataset
- The dataset to explore.
- job_type: JobType, optional
- The job type
- job_name: str, optional
- The name of the job to create. A unique name will be generated if this is not given.
- predictions_file: str, optional
- The path to the catalog file containing predictions and ground truth. This file must be formatted according to the specification at: https://docs.akridata.ai/docs/analyze-job-creation-and-visualization
- cluster_algo: ClusterAlgoType, optional
- The clustering algorithm to use.
- embed_algo: EmbedAlgoType, optional
- The embedding algorithm to use.
- num_clusters: int, optional
- The number of clusters to create.
- max_images: int, optional
- The maximum number of images to use.
- catalog_table: CatalogTable, optional
- The catalog to be used for creating this explore job. This defaults to the internal primary catalog that is created automatically when a dataset is created. default: "primary"
- analyze_params: AnalyzeJobParams, optional
- Analyze job related configuration parameters
- pipeline: Pipeline, optional
- The pipeline to use for this job.
- filters: List[Condition], optional
- The filters to be used to select a subset of samples for this job. These filters are applied to the catalog specified by catalog_table.
- reference_job: Job, optional
- The reference job for this compare job
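A minimal sketch creating an explore job; the enum classes are assumed to be importable from akride.core.enums, and client/dataset are an AkriDEClient and an ingested Dataset:

```python
from akride.core.enums import ClusterAlgoType, EmbedAlgoType, JobType  # assumed import path

spec = client.create_job_spec(
    dataset=dataset,
    job_type=JobType.EXPLORE,
    job_name="wildlife-explore",          # placeholder; a unique name is generated if omitted
    cluster_algo=ClusterAlgoType.HDBSCAN,
    embed_algo=EmbedAlgoType.UMAP,
    max_images=1000,
)
job = client.create_job(spec=spec)
```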
 
 - create_table(dataset: Dataset, table_name: str, schema: Dict[str, str], indices: List[str] | None = None) str[source]
- Adds an empty external catalog to the dataset.
- Parameters:
- dataset (Dataset) – The dataset to add the external catalog to.
- table_name (str) – The name of the table to create for the catalog.
- schema (Dict[str, str]) – Mapping of column names to column types.
- indices (Optional[List[str]]) – The columns to be indexed.
- Returns:
- Returns the absolute table name for the external catalog.
- Return type:
- str
 
 - create_view(view_name: str, description: str | None, dataset: Dataset, left_table: CatalogTable, right_table: CatalogTable, join_condition: JoinCondition, inner_join: bool = False) str[source]
- Create a SQL view for visualization. Note: a left join is used by default while creating the view.
- Parameters:
- view_name (str) – Name of the view to create 
- description (Optional[str]) – Description text 
- dataset (Dataset) – Dataset object 
- left_table (TableInfo) – Left Table of the create view query 
- right_table (TableInfo) – Right Table of the create view query 
- join_condition (JoinCondition) – JoinCondition that includes the join column from the left and the right table
- inner_join (bool) – Use inner join for joining the tables 
 
- Returns:
- The view id
- Return type:
- str
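A sketch joining an imported catalog with the primary catalog. CatalogTable and JoinCondition are listed under akride.core.types, but the constructor arguments shown are assumptions to verify against that module; table and column names are placeholders:

```python
from akride.core.types import CatalogTable, JoinCondition

left = CatalogTable(table_name="primary")                 # assumed constructor argument
right = CatalogTable(table_name="weather_labels")
condition = JoinCondition(left_column="file_name",        # assumed constructor arguments
                          right_column="file_name")

view_id = client.create_view(
    view_name="images_with_weather",
    description="Primary catalog joined with weather labels",
    dataset=dataset,
    left_table=left,
    right_table=right,
    join_condition=condition,
)
```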
 
 - get_all_columns(dataset: Dataset, table: CatalogTable) List[Column][source]
- Returns all columns for a table/view 
 - get_attached_pipelines(dataset: Dataset, version: str | None = None) List[Pipeline][source]
- Get the pipelines attached to a dataset, given a dataset version
 - get_bgc_attached_pipeline_progress_report(dataset: Dataset, pipeline: Pipeline) BGCAttachmentJobStatus[source]
- Get Background Catalog progress for the dataset attachment - Parameters:
- Returns:
- Background Catalog status for the dataset attachment 
- Return type:
 
 - get_catalog_by_name(dataset: Dataset, name: str) Entity | None[source]
- Retrieves a catalog with the given name. 
 - get_catalog_data_count(dataset: Dataset, table_name: str, filter_str: str | None = None) int[source]
- Retrieves the number of rows in a catalog table, optionally restricted by a filter
 - get_catalog_tags(samples: SampleInfoList) DataFrame[source]
- Retrieves the catalog tags corresponding to the given samples.
- Parameters:
- samples (SampleInfoList) – The samples to retrieve catalog tags for. 
- Returns:
- A dataframe of catalog tags. 
- Return type:
- pd.DataFrame 
 
 - get_catalogs(attributes: Dict[str, Any] = {}) List[Entity][source]
- Retrieves information about catalogs that have the given attributes.
- Parameters:
- attributes (Dict[str, Any]) – The filter specification. It may have the following optional fields:
- name: str
- filter by catalog name
- status: str
- filter by catalog status, can be one of "active", "inactive", "refreshing", "offline", "invalid-config"
 
- Returns:
- A list of Entity objects representing catalogs. 
- Return type:
- List[Entity] 
 
 - get_compatible_reference_jobs(dataset: Dataset, pipeline: Pipeline, catalog_table: CatalogTable, search_key: str | None = None) List[Job][source]
- Retrieves jobs created from a given catalog_table which can be used to create "JobType.COMPARE" job types
- Parameters:
- Returns:
- A list of Entity objects representing jobs. 
- Return type:
- List[Entity] 
 
 - get_containers(attributes: Dict[str, Any] | None = None) List[Entity][source]
- Retrieves information about containers that have the given attributes. 
 - get_datasets(attributes: Dict[str, Any] = {}) List[Entity][source]
- Retrieves information about datasets that have the given attributes. 
 - get_files_to_be_processed(dataset: Dataset, pipeline: Pipeline, batch_size: int) DatasetUnprocessedFiles[source]
- Get files to be processed for the dataset
- Parameters:
- Returns:
- Dataset files to be processed.
- Return type:
- DatasetUnprocessedFiles
 
 - get_fullres_image_urls(samples: SampleInfoList) Dict[source]
- Retrieves the full-resolution image URLs for the given samples.
- Parameters:
- samples (SampleInfoList) – The samples to retrieve full res image urls for. 
- Returns:
- A dictionary containing the full-resolution image URLs for each sample. 
- Return type:
- Dict 
 
 - get_fullres_images(samples: SampleInfoList) List[Image][source]
- Retrieves the full-resolution images for the given samples.
- Parameters:
- samples (SampleInfoList) – The samples to retrieve images for. 
- Returns:
- A list of images. 
- Return type:
- List[Image.Image] 
 
 - get_job_samples(job: Job, job_context: JobContext, spec: SimilaritySearchSpec | ConfusionMatrixCellSpec | ClusterRetrievalSpec | CoresetSamplingSpec, **kwargs) SampleInfoList[source]
- Retrieves the samples according to the given specification.
- Parameters:
- job (Job) – The Job object to get samples for.
- job_context (JobContext) – The context in which the samples are requested.
- spec (Union[SimilaritySearchSpec, ConfusionMatrixCellSpec, ClusterRetrievalSpec, CoresetSamplingSpec]) – The job context spec.
- **kwargs – Additional keyword arguments. Supported keyword arguments:
- iou_config_threshold: float, optional
- Threshold value for iou config
- confidence_score_threshold: float, optional
- Threshold value for confidence score
 
 
- Returns:
- A SampleInfoList object. 
- Return type:
- SampleInfoList
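A sketch retrieving samples from a cluster of an explore job and fetching their thumbnails and catalog tags; the JobContext member and the ClusterRetrievalSpec arguments are illustrative assumptions (check akride.core.enums and akride.core.types for the exact names):

```python
from akride.core.enums import JobContext            # assumed import path
from akride.core.types import ClusterRetrievalSpec

samples = client.get_job_samples(
    job=job,
    job_context=JobContext.CLUSTER_RETRIEVAL,               # illustrative member name
    spec=ClusterRetrievalSpec(cluster_id=0, max_count=25),  # illustrative arguments
)
thumbnails = client.get_thumbnail_images(samples)    # list of PIL images
tags = client.get_catalog_tags(samples)              # pandas DataFrame of catalog tags
```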
 
 - get_job_samples_from_file_path(job: Job, file_info: List[str]) Dict[source]
- Retrieves the samples corresponding to the given file paths.
 - get_job_statistics(job: Job, context: JobStatisticsContext, **kwargs) JobStatistics[source]
- Retrieves statistics info from an analyze job.
- Parameters:
- job (Job) – The Job object to get statistics for. 
- context (JobStatisticsContext) – The type of statistics to retrieve. 
- **kwargs – Additional keyword arguments. Supported keyword arguments:
- iou_config_threshold: float, optional
- Threshold value for iou config
- confidence_score_threshold: float, optional
- Threshold value for confidence score
 
 
- Returns:
- A job statistics object. 
- Return type:
- JobStatistics
 
 - get_jobs(attributes: Dict[str, Any] = {}) List[Entity][source]
- Retrieves information about jobs that have the given attributes.
- Parameters:
- attributes (Dict[str, Any]) – The filter specification. It may have the following optional fields:
- data_type: str
- The data type to filter on. This can be 'IMAGE' or 'VIDEO'.
- job_type: str
- The job type to filter on - 'EXPLORE', 'ANALYZE' etc.
- search_key: str
- Filter jobs across fields like job name, dataset id, and dataset name.
 
- Returns:
- A list of Entity objects representing jobs. 
- Return type:
- List[Entity] 
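For example, filtering jobs by data type, job type, and a search string:

```python
jobs = client.get_jobs(attributes={
    "data_type": "IMAGE",
    "job_type": "EXPLORE",
    "search_key": "wildlife",   # matched against job name, dataset id, and dataset name
})
```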
 
 - get_progress_info(task: BackgroundTask) ProgressInfo[source]
- Gets the progress of the specified task.
- Parameters:
- task (BackgroundTask) – The task object to retrieve the progress information for.
- Returns:
- The progress information
- Return type:
- ProgressInfo
 
 - get_repository_by_name(name: str) Entity | None[source]
- Retrieves a Docker repository with the given name. 
 - get_resultset_by_id(resultset_id: str) Entity[source]
- Retrieves a resultset with the given identifier. 
 - get_resultset_samples(resultset: Resultset, max_sample_size: int = 10000) SampleInfoList[source]
- Retrieves the samples of a resultset
- Parameters:
- resultset (Resultset) – The Resultset object to get samples for. 
- Returns:
- A SampleInfoList object. 
- Return type:
- SampleInfoList
 
 - get_resultsets(attributes: Dict[str, Any] = {}) List[Entity][source]
- Retrieves information about resultsets that have the given attributes. 
 - get_secrets(name: str, namespace: str) SMSSecrets | None[source]
- Retrieves information about SMS Secret for the given SMS secret name and namespace.
- Parameters:
- name (str) – The SMS secret name.
- namespace (str) – The namespace of the SMS secret.
- Returns:
- Object representing Secrets.
- Return type:
- Optional[SMSSecrets]
 
 - get_server_version() str[source]
- Get Dataexplorer server version
- Returns:
- server version
- Return type:
- str
 
 - get_thumbnail_images(samples: SampleInfoList) List[Image][source]
- Retrieves the thumbnail images corresponding to the samples.
- Parameters:
- samples (SampleInfoList) – The samples to retrieve thumbnails for. 
- Returns:
- A list of thumbnail images. 
- Return type:
- List[Image.Image] 
 
 - get_view_id(dataset: Dataset, view_name: str) CatalogViewInfo | None[source]
- Retrieves the view id for a view of a dataset
- Parameters:
- dataset (Dataset) – Dataset object
- view_name (str) – Name of the view
- Returns:
- Returns the CatalogViewInfo object 
- Return type:
- Optional[CatalogViewInfo] 
 
 - import_catalog(dataset: Dataset, table_name: str, csv_file_path: str, create_view: bool = True, file_name_column: str | None = None, pipeline_name: str | None = None, import_identifier: str | None = None) bool[source]
- Method for importing an external catalog into a dataset.
- Parameters:
- dataset (Dataset) – The dataset to import the catalog into. 
- table_name (str) – The name of the table to create for the catalog. 
- csv_file_path (str) – The path to the CSV file containing the catalog data. 
- create_view (bool default: True) – Create a view with imported catalog and primary catalog table 
- file_name_column (str) – Name of the column in the csv file that contains the absolute filename 
- pipeline_name (str) – Name of pipeline whose primary table will be joined with the imported table. Ignored if create_view is false 
- import_identifier (str) – Unique identifier for importing data 
 
- Returns:
- Indicates whether the operation was successful. 
- Return type:
- bool
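A minimal sketch with placeholder table, file, and column names; client is an AkriDEClient and dataset an existing Dataset:

```python
ok = client.import_catalog(
    dataset=dataset,
    table_name="weather_labels",               # placeholder table name
    csv_file_path="/data/weather_labels.csv",  # placeholder CSV path
    create_view=True,
    file_name_column="file_path",              # placeholder: column holding the absolute filename
)
```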
 
 - ingest_dataset(dataset: Dataset, data_directory: str, use_patch_featurizer: bool = True, with_clip_featurizer: bool = False, async_req: bool = False, catalog_details: CatalogDetails | None = None) BackgroundTask | None[source]
- Starts an asynchronous ingest task for the specified dataset.
- Parameters:
- dataset (Dataset) – The dataset to ingest. 
- data_directory (str) – The path to the directory containing the dataset files. 
- use_patch_featurizer (bool, optional) – Ingest dataset to enable patch-based similarity searches. 
- with_clip_featurizer (bool, optional) – Ingest dataset to enable text prompt based search. 
- async_req (bool, optional) – Whether to execute the request asynchronously. 
- catalog_details (Optional[CatalogDetails]) – Parameters details for creating a catalog 
 
- Returns:
- A task object 
- Return type:
- BackgroundTask 
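A sketch of an asynchronous ingest followed by polling; the data directory is a placeholder:

```python
task = client.ingest_dataset(
    dataset=dataset,
    data_directory="/data/wildlife",   # placeholder directory containing the files
    use_patch_featurizer=True,
    async_req=True,
)
progress = client.get_progress_info(task)    # point-in-time progress
progress = client.wait_for_completion(task)  # or block until the ingest finishes
```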
 
 - register_docker_image(spec: DockerImageSpec) DockerImage | None[source]
- Registers a Docker Image
- Parameters:
- spec (DockerImageSpec) – Docker Image Specification
- Returns:
- Object representing the Docker Image
- Return type:
- DockerImage
 
 - submit_bgc_job(dataset: Dataset, pipelines: List[Pipeline]) BGCJob[source]
- Submits a Background Cataloging Job for the dataset 
 - update_resultset(resultset: Resultset, add_list: SampleInfoList | None = None, del_list: SampleInfoList | None = None) bool[source]
- Updates a resultset.
- Parameters:
- resultset (Resultset) – The resultset to be updated. 
- add_list (SampleInfoList, optional) – The list of samples to be added. 
- del_list (SampleInfoList, optional) – The list of samples to be deleted. 
 
- Returns:
- Indicates whether the operation was successful. 
- Return type:
- bool
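For example, appending samples retrieved from a job to an existing resultset (the resultset id is a placeholder and samples is a SampleInfoList):

```python
resultset = client.get_resultset_by_id("resultset-id-123")  # placeholder id
ok = client.update_resultset(resultset, add_list=samples)
```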
 
 - wait_for_completion(task: BackgroundTask) ProgressInfo[source]
- Waits for the specified task to complete.
- Parameters:
- task (BackgroundTask) – The task to wait for.
- Returns:
- The progress information
- Return type:
- ProgressInfo
 
 
akride.main module
Module contents
- akride.init(sdk_config_tuple: Tuple[str, str] | None = None, sdk_config_dict: dict | None = None, sdk_config_file: str | None = '') AkriDEClient[source]
- Initializes the AkriDEClient with the saas_endpoint and api_key values. The init params can be passed in different ways; in case multiple options are used, the order of preference is 1. sdk_config_tuple, 2. sdk_config_dict, 3. sdk_config_file.
- Get the config by signing in to the Data Explorer UI and navigating to Utilities → Get CLI/SDK config.
- Parameters:
- sdk_config_tuple (tuple) – A tuple consisting of saas_endpoint and api_key in that order
- sdk_config_dict (dict) – Dictionary containing "saas_endpoint" and "api_key"
- sdk_config_file (str) – Path to the SDK config file downloaded from Dataexplorer
- Raises:
- InvalidAuthConfigError – if api-key/host is invalid
- ServerNotReachableError – if the server is unreachable
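A minimal sketch of the three initialization options; the endpoint, key, and file path are placeholders:

```python
import akride

# 1. Tuple of (saas_endpoint, api_key) -- takes precedence over the other options
client = akride.init(sdk_config_tuple=("https://example.dataexplorer.akridata.ai", "my-api-key"))

# 2. Dictionary with the same two fields
client = akride.init(sdk_config_dict={
    "saas_endpoint": "https://example.dataexplorer.akridata.ai",
    "api_key": "my-api-key",
})

# 3. Path to the SDK config file downloaded from Data Explorer
client = akride.init(sdk_config_file="~/Downloads/sdk_config.json")

print(client.get_server_version())
```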