Model¶

Some methods of tdclient.client.Client return model objects that represent results from the REST API.

tdclient.model¶

class tdclient.model.Model(client)[source]¶

Bases: object

property client¶

Returns: a tdclient.client.Client instance

tdclient.models¶

tdclient.models.BulkImport = <class 'tdclient.bulk_import_model.BulkImport'>[source]¶: Bulk-import session on Treasure Data Service

tdclient.models.Database = <class 'tdclient.database_model.Database'>[source]¶: Database on Treasure Data Service

tdclient.models.Schema = <class 'tdclient.job_model.Schema'>[source]¶: Schema of a database table on Treasure Data Service

tdclient.models.Job = <class 'tdclient.job_model.Job'>[source]¶: Job on Treasure Data Service

tdclient.models.Result = <class 'tdclient.result_model.Result'>[source]¶: Result on Treasure Data Service

tdclient.models.ScheduledJob = <class 'tdclient.schedule_model.ScheduledJob'>[source]¶: Scheduled job on Treasure Data Service

tdclient.models.Schedule = <class 'tdclient.schedule_model.Schedule'>[source]¶: Schedule on Treasure Data Service

tdclient.models.Table = <class 'tdclient.table_model.Table'>[source]¶: Database table on Treasure Data Service

tdclient.models.User = <class 'tdclient.user_model.User'>[source]¶: User on Treasure Data Service

tdclient.bulk_import_model¶

class tdclient.bulk_import_model.BulkImport(client, **kwargs)[source]¶

Bases: tdclient.model.Model

Bulk-import session on Treasure Data Service

commit(wait=False, wait_interval=5, timeout=None)[source]¶: Commit bulk import

delete()[source]¶: Delete bulk import

delete_part(part_name)[source]¶

Delete a part of a Bulk Import session

Parameters: part_name (str) – name of a part of the bulk import session
Returns: True if succeeded.

error_record_items()[source]¶

Fetch error record rows.

Yields: Error record

freeze()[source]¶: Freeze bulk import

list_parts()[source]¶

Return the list of available parts uploaded through bulk_import_upload_part().

Returns: The list of bulk import part names.
Return type: [str]

perform(wait=False, wait_interval=5, wait_callback=None)[source]¶

Perform bulk import

Parameters

wait (bool, optional) – Whether to wait for the bulk import job to finish. Default False.
wait_interval (int, optional) – wait interval in seconds. Default 5.
wait_callback (callable, optional) – A callable to be called on every tick of wait interval.

unfreeze()[source]¶: Unfreeze bulk import

update()[source]¶

upload_file(part_name, fmt, file_like, **kwargs)[source]¶

Upload a part to Bulk Import session, from an existing file on filesystem.

Parameters

part_name (str) – name of a part of the bulk import session
fmt (str) – format of data type (e.g. “msgpack”, “json”, “csv”, “tsv”)
file_like (str or file-like) – the name of a file, or a file-like object, containing the data
**kwargs – extra arguments.

There is more documentation on fmt, file_like and **kwargs at file import parameters.

In particular, for “csv” and “tsv” data, you can change how data columns are parsed using the dtypes and converters arguments.

dtypes is a dictionary used to specify a datatype for individual columns, for instance {"col1": "int"}. The available datatypes are "bool", "float", "int", "str" and "guess". If a column is also mentioned in converters, then the function will be used, NOT the datatype.
converters is a dictionary used to specify a function that will be used to parse individual columns, for instance {"col1": int}.

The default behaviour is "guess", which makes a best-effort to decide the column datatype. See file import parameters for more details.
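The precedence between dtypes and converters described above can be illustrated with a small stand-alone sketch. It does not call tdclient: parse_row and DTYPE_PARSERS are hypothetical names introduced here, and the rules used for "bool" and "guess" are assumptions, not the library's implementation.

```python
# Illustrative only: mimics the documented precedence (converters beat dtypes).
# parse_row and DTYPE_PARSERS are hypothetical, not part of tdclient.

def _guess(value):
    """Best-effort guess: try int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


DTYPE_PARSERS = {
    "bool": lambda v: v.strip().lower() in ("true", "1"),  # assumed rule
    "float": float,
    "int": int,
    "str": str,
    "guess": _guess,
}


def parse_row(row, dtypes=None, converters=None):
    """Parse a dict of raw CSV strings, column by column."""
    dtypes = dtypes or {}
    converters = converters or {}
    parsed = {}
    for column, raw in row.items():
        if column in converters:
            # A converter function wins over any dtype for the same column.
            parsed[column] = converters[column](raw)
        else:
            # Columns without an explicit dtype fall back to "guess".
            parsed[column] = DTYPE_PARSERS[dtypes.get(column, "guess")](raw)
    return parsed


row = {"col1": "001", "col2": "3.5", "col3": "42"}
result = parse_row(
    row,
    dtypes={"col1": "int", "col3": "str"},
    converters={"col1": str},  # overrides the "int" dtype for col1
)
print(result)
```

With a real session, the same two dictionaries would be passed to upload_file as the dtypes and converters keyword arguments.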

upload_part(part_name, bytes_or_stream, size)[source]¶

Upload a part to bulk import session

Parameters

part_name (str) – name of a part of the bulk import session
bytes_or_stream (file-like) – a file-like object containing the part
size (int) – the size of the part

STATUS_COMMITTED = 'committed'¶

STATUS_COMMITTING = 'committing'¶

STATUS_PERFORMING = 'performing'¶

STATUS_READY = 'ready'¶

STATUS_UPLOADING = 'uploading'¶

property database¶: The name of the database the bulk import session is working on, as a string

property error_parts¶: The number of error parts.

property error_records¶: The number of error records.

property job_id¶: Job ID

property name¶: A name of the bulk import session

property status¶: The status of the bulk import session, as a string

property table¶: The name of the table the bulk import session is working on, as a string

property upload_frozen¶: Whether uploads to the bulk import session are frozen.

property valid_parts¶: The number of valid parts.

property valid_records¶: The number of valid records.
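As a rough sketch of how the methods above might fit together, the following assumes the usual ordering of upload, freeze, perform, commit. FakeBulkImport is a hypothetical recording stand-in used so the example runs without network access, and run_bulk_import is a name introduced here; stopping before commit when error_records is non-zero is a design choice of this sketch, not documented behaviour.

```python
import io


class FakeBulkImport:
    """Stand-in that records calls; it only mirrors the documented method names."""

    def __init__(self):
        self.calls = []
        self.error_records = 0  # mirrors the documented error_records property

    def upload_part(self, part_name, bytes_or_stream, size):
        self.calls.append(("upload_part", part_name, size))

    def freeze(self):
        self.calls.append(("freeze",))

    def perform(self, wait=False, wait_interval=5, wait_callback=None):
        self.calls.append(("perform", wait))

    def commit(self, wait=False, wait_interval=5, timeout=None):
        self.calls.append(("commit", wait))


def run_bulk_import(session, parts):
    """Upload every part, then freeze, perform and commit the session."""
    for part_name, payload in parts.items():
        session.upload_part(part_name, io.BytesIO(payload), len(payload))
    session.freeze()
    session.perform(wait=True)  # wait for the bulk import job
    if session.error_records > 0:
        # Assumed policy: refuse to commit when any record failed.
        raise RuntimeError("bulk import produced error records")
    session.commit(wait=True)


session = FakeBulkImport()
run_bulk_import(session, {"part1": b"abc", "part2": b"defgh"})
print(session.calls)
```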

tdclient.database_model¶

class tdclient.database_model.Database(client, db_name, **kwargs)[source]¶

Bases: tdclient.model.Model

Database on Treasure Data Service

create_log_table(name)[source]¶

Parameters: name (str) – name of new log table
Returns: tdclient.models.Table

delete()[source]¶

Delete the database

Returns: True if succeeded

query(q, **kwargs)[source]¶

Run a query on the database

Parameters: q (str) – a query string
Returns: tdclient.models.Job

table(table_name)[source]¶

Parameters: table_name (str) – name of a table
Returns: tdclient.models.Table

tables()[source]¶

Returns: a list of tdclient.models.Table

PERMISSIONS = ['administrator', 'full_access', 'import_only', 'query_only']¶

PERMISSION_LIST_TABLES = ['administrator', 'full_access']¶
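The two constants can be read as "all known permissions" and "the permissions that allow listing tables". The sketch below works on that reading, which is an interpretation of the names rather than documented behaviour; can_list_tables is a hypothetical helper and the lists are copied from the values above.

```python
# Values copied from Database.PERMISSIONS and Database.PERMISSION_LIST_TABLES.
PERMISSIONS = ["administrator", "full_access", "import_only", "query_only"]
PERMISSION_LIST_TABLES = ["administrator", "full_access"]


def can_list_tables(permission):
    """Return True if the permission appears in PERMISSION_LIST_TABLES."""
    if permission not in PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    return permission in PERMISSION_LIST_TABLES


for permission in PERMISSIONS:
    print(permission, can_list_tables(permission))
```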

property count¶

Total record counts in a database.

Type: int

property created_at¶: datetime.datetime

property name¶

a name of the database

Type: str

property org_name¶

organization name

Type: str

property permission¶

permission for the database (e.g. “administrator”, “full_access”, etc.)

Type: str

property updated_at¶: datetime.datetime

tdclient.job_model¶

class tdclient.job_model.Job(client, job_id, type, query, **kwargs)[source]¶

Bases: tdclient.model.Model

Job on Treasure Data Service

error()[source]¶

Returns: True if the job finished with an error

finished()[source]¶

Returns: True if the job has finished (with status success, error or killed)

kill()[source]¶

Kill the job

Returns: a string representing the status of the killed job (“queued”, “running”)

killed()[source]¶

Returns: True if the job was killed

queued()[source]¶

Returns: True if the job is queued

result()[source]¶

Yields: rows of the result set

result_format(fmt)[source]¶

Parameters: fmt (str) – output format of result set
Yields: rows of the result set

running()[source]¶

Returns: True if the job is running

status()[source]¶

Returns: a string representing the status of the job (“success”, “error”, “killed”, “queued”, “running”)
Return type: str

success()[source]¶

Returns: True if the job finished successfully

update()[source]¶: Update all fields of the job

wait(timeout=None, wait_interval=5, wait_callback=None)[source]¶

Sleep until the job has been finished

Parameters

timeout (int, optional) – Timeout in seconds. No timeout by default.
wait_interval (int, optional) – wait interval in seconds. Default 5 seconds.
wait_callback (callable, optional) – A callable to be called on every tick of wait interval.
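The interplay of timeout, wait_interval and wait_callback can be sketched as a plain polling loop. This is not the library's code: FakeJob and wait_for are hypothetical, the callback receiving the job as its argument is an assumption, and so is the exception raised on timeout.

```python
import time

# Copied from Job.FINISHED_STATUS.
FINISHED_STATUS = ["success", "error", "killed"]


class FakeJob:
    """Stand-in whose status() walks through a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self._current = None

    def status(self):
        # Keep returning the last status once the sequence is exhausted.
        self._current = next(self._statuses, self._current)
        return self._current


def wait_for(job, timeout=None, wait_interval=5, wait_callback=None):
    """Poll job.status() until it is finished or the timeout elapses."""
    started = time.monotonic()
    while job.status() not in FINISHED_STATUS:
        if timeout is not None and time.monotonic() - started >= timeout:
            raise RuntimeError("timed out waiting for job")  # assumed error type
        time.sleep(wait_interval)
        if callable(wait_callback):
            wait_callback(job)  # invoked once per polling tick


ticks = []
job = FakeJob(["queued", "running", "running", "success"])
wait_for(job, timeout=5, wait_interval=0.01, wait_callback=lambda j: ticks.append(1))
print(len(ticks))
```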

FINISHED_STATUS = ['success', 'error', 'killed']¶

JOB_PRIORITY = {-2: 'VERY LOW', -1: 'LOW', 0: 'NORMAL', 1: 'HIGH', 2: 'VERY HIGH'}¶

STATUS_BOOTING = 'booting'¶

STATUS_ERROR = 'error'¶

STATUS_KILLED = 'killed'¶

STATUS_QUEUED = 'queued'¶

STATUS_RUNNING = 'running'¶

STATUS_SUCCESS = 'success'¶

property database¶: a string representing the name of the database that the job is running on

property debug¶: a dict of debug output (e.g. “cmdout”, “stderr”)

property id¶: a string representing the identifier of the job

property job_id¶: a string representing the identifier of the job

property linked_result_export_job_id¶: Linked result export job ID from query job

property num_records¶: the number of records in the job result

property org_name¶: organization name

property priority¶: a string representing the priority of the job (e.g. “NORMAL”, “HIGH”, etc.)

property query¶: a string representing the query string of the job

property result_export_target_job_id¶: Associated query job ID from result export job ID

property result_schema¶: an array of arrays representing the types of the result columns (Hive specific) (e.g. [[“_c1”, “string”], [“_c2”, “bigint”]])

property result_size¶: the length of job result

property result_url¶: a string containing the URL of the result on Treasure Data Service

property retry_limit¶: a number for automatic retry count

property type¶: a string representing the engine type of the job (e.g. “hive”, “presto”, etc.)

property url¶: a string containing the URL of the job on Treasure Data Service

property user_name¶: executing user name
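One possible use of result_schema is to label the rows produced by result(). The sketch below uses hypothetical sample data in the documented [[name, type], ...] shape and assumes each row is a sequence with one value per column; rows_as_dicts is a helper introduced here.

```python
# Hypothetical sample data shaped like the documented result_schema example.
result_schema = [["_c1", "string"], ["_c2", "bigint"]]
rows = [["alice", 3], ["bob", 7]]  # assumed shape of rows yielded by result()


def rows_as_dicts(schema, rows):
    """Yield one dict per row, keyed by the column names in the schema."""
    names = [column[0] for column in schema]
    for row in rows:
        yield dict(zip(names, row))


records = list(rows_as_dicts(result_schema, rows))
print(records)
```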

class tdclient.job_model.Schema(fields=None)[source]¶

Bases: object

Schema of a database table on Treasure Data Service

class Field(name, type)[source]¶

Bases: object

property name¶

TODO: add docstring

property type¶

TODO: add docstring

add_field(name, type)[source]¶: TODO: add docstring

property fields¶

TODO: add docstring

tdclient.result_model¶

class tdclient.result_model.Result(client, name, url, org_name)[source]¶

Bases: tdclient.model.Model

Result on Treasure Data Service

property name¶

a name for an authentication

Type: str

property org_name¶

organization name

Type: str

property url¶

a result output URL

Type: str

tdclient.schedule_model¶

class tdclient.schedule_model.Schedule(client, *args, **kwargs)[source]¶

Bases: tdclient.model.Model

Schedule on Treasure Data Service

run(time, num=None)[source]¶

Run a scheduled job

Parameters

time (int) – Time in Unix epoch format that would be set as TD_SCHEDULED_TIME
num (int) – Indicates how many times the query will be executed. Value should be 9 or less.

Returns

[tdclient.models.ScheduledJob]
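Since time is expected as a Unix epoch value and num is capped at 9, the arguments can be prepared from a datetime as in the sketch below. schedule_run_args is a hypothetical helper; the lower bound of 1 for num and the insistence on timezone-aware datetimes are assumptions of this sketch.

```python
from datetime import datetime, timezone

MAX_RUNS = 9  # documented upper bound for num


def schedule_run_args(when, num=None):
    """Return a (time, num) pair suitable for Schedule.run(time, num)."""
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid ambiguity")
    if num is not None and not 1 <= num <= MAX_RUNS:
        raise ValueError(f"num must be between 1 and {MAX_RUNS}")
    return int(when.timestamp()), num


args = schedule_run_args(datetime(2020, 1, 1, tzinfo=timezone.utc), num=3)
print(args)
```

With a real Schedule object, the pair could then be unpacked as schedule.run(*args).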

property created_at¶

Create date

Type: datetime.datetime

property cron¶

The configured schedule of a scheduled job.

Returns a string representing the schedule in cron form, or None if the job is not scheduled to run (saved query)

property database¶: The target database of a scheduled job

property delay¶: A delay ensures all buffered events are imported before running the query.

property name¶: The name of a scheduled job

property next_time¶

Schedule for next run

Type: datetime.datetime

property org_name¶

TODO: add docstring

property priority¶: The priority of a scheduled job

property query¶: The query string of a scheduled job

property result_url¶: The result output configuration in URL form of a scheduled job

property retry_limit¶: Automatic retry count.

property timezone¶: The time zone of a scheduled job

property type¶: Query type. {“presto”, “hive”}.

property user_name¶: User name of a scheduled job

class tdclient.schedule_model.ScheduledJob(client, scheduled_at, job_id, type, query, **kwargs)[source]¶

Bases: tdclient.job_model.Job

Scheduled job on Treasure Data Service

property scheduled_at¶: a datetime.datetime representing the schedule of the next invocation of the job

tdclient.table_model¶

class tdclient.table_model.Table(*args, **kwargs)[source]¶

Bases: tdclient.model.Model

Database table on Treasure Data Service

delete()[source]¶: Delete the table and return a string representing the type of the deleted table

export_data(storage_type, **kwargs)[source]¶

Export data from Treasure Data Service

Parameters

storage_type (str) – type of the storage
**kwargs (dict) –
optional parameters. Assuming the following keys:
- access_key_id (str):
  ID to access the information to be exported.
- secret_access_key (str):
  Password for the access_key_id.
- file_prefix (str, optional):
  Filename of exported file. Default: “<database_name>/<table_name>”
- file_format (str, optional):
  File format of the information to be exported. {“jsonl.gz”, “tsv.gz”, “json.gz”}
- from (int, optional):
  From Time of the data to be exported in Unix epoch format.
- to (int, optional):
  End Time of the data to be exported in Unix epoch format.
- assume_role (str, optional):
  Assume role.
- bucket (str):
  Name of bucket to be used.
- domain_key (str, optional):
  Job domain key.
- pool_name (str, optional):
  For Presto only. Pool name to be used, if not specified, default pool would be used.

Returns

tdclient.models.Job
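One practical detail of the keys above is that from is a reserved word in Python, so it cannot be written as a literal keyword argument and has to travel in a dictionary. The sketch below builds such a dictionary; build_export_kwargs is a hypothetical helper, the credential and bucket values are placeholders, and "s3" in the closing comment is an assumed storage type.

```python
def build_export_kwargs(bucket, access_key_id, secret_access_key,
                        start=None, end=None, file_format="jsonl.gz"):
    """Assemble the optional export parameters as a plain dict."""
    kwargs = {
        "bucket": bucket,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "file_format": file_format,
    }
    # "from" is a reserved word in Python, so it can only be supplied
    # as a dictionary key and passed on with ** unpacking.
    if start is not None:
        kwargs["from"] = start
    if end is not None:
        kwargs["to"] = end
    return kwargs


export_kwargs = build_export_kwargs(
    bucket="example-bucket",             # placeholder
    access_key_id="EXAMPLE_KEY_ID",      # placeholder
    secret_access_key="EXAMPLE_SECRET",  # placeholder
    start=1577836800,
    end=1577923200,
)
print(sorted(export_kwargs))
# With a real Table object this might be used as:
#   job = table.export_data("s3", **export_kwargs)  # "s3" is an assumed storage type
```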

import_data(format, bytes_or_stream, size, unique_id=None)[source]¶

Import data into Treasure Data Service

Parameters

format (str) – format of data type (e.g. “msgpack.gz”)
bytes_or_stream (str or file-like) – a byte string or a file-like object containing the data
size (int) – the length of the data
unique_id (str) – a unique identifier of the data

Returns

a float representing the elapsed time to import the data, in seconds

import_file(format, file, unique_id=None)[source]¶

Import data into Treasure Data Service, from an existing file on filesystem.

This method will decompress/deserialize records from the given file, and then convert them into a format accepted by Treasure Data Service (“msgpack.gz”).

Parameters

format (str) – format of the data in the file (e.g. “msgpack”, “json”, “csv”, “tsv”)
file (str or file-like) – the name of a file, or a file-like object containing the data
unique_id (str) – a unique identifier of the data

Returns

a float representing the elapsed time to import the data

tail(count, to=None, _from=None)[source]¶

Parameters

count (int) – Number of records to show from the end.
to – Deprecated parameter.
_from – Deprecated parameter.

Returns

the contents of the table in reverse order based on the registered time (last data first).

property count¶

total number of records in the table

Type: int

property created_at¶

Created datetime

Type: datetime.datetime

property database_name¶: a string representing the name of the database

property db_name¶: a string representing the name of the database

property estimated_storage_size¶: estimated storage size

property estimated_storage_size_string¶: a string representing the estimated size of the table in human-readable format

property expire_days¶: an int representing the days until expiration

property identifier¶: a string identifier of the table

property last_import¶: datetime.datetime

property last_log_timestamp¶: datetime.datetime

property name¶: a string representing the name of the table

property permission¶

permission for the database (e.g. “administrator”, “full_access”, etc.)

Type: str

property primary_key¶

TODO: add docstring

property primary_key_type¶

TODO: add docstring

property schema¶

The list of a schema

Type: [[column_name:str, column_type:str, alias:str]]
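A schema value is a list of [column_name, column_type, alias] triples, which is easy to index by column name. The sketch below uses hypothetical sample columns, and schema_by_column is a helper introduced here for illustration.

```python
# Hypothetical sample in the [[column_name, column_type, alias]] shape.
schema = [
    ["time", "long", "time"],
    ["user_id", "string", "uid"],
]


def schema_by_column(schema):
    """Map each column name to a (column_type, alias) tuple."""
    return {name: (column_type, alias) for name, column_type, alias in schema}


columns = schema_by_column(schema)
print(columns["user_id"])
```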

property table_name¶: a string representing the name of the table

property type¶: a string representing the type of the table

property updated_at¶

Updated datetime

Type: datetime.datetime