Model

Some methods of tdclient.client.Client return model objects that represent results from the REST API.
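For orientation, here is a minimal usage sketch. It assumes tdclient is installed and a valid API key is available; the client calls are shown as comments because they require a live connection, and names such as sample_datasets are purely illustrative:

```python
import os

# An API key would normally come from the environment or a config file.
api_key = os.getenv("TD_API_KEY", "dummy-key-for-illustration")

# with tdclient.Client(apikey=api_key) as client:
#     db = client.database("sample_datasets")   # tdclient.models.Database
#     job = db.query("SELECT COUNT(1) FROM www_access")  # tdclient.models.Job
#     job.wait()
#     for row in job.result():
#         print(row)
```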

tdclient.model

class tdclient.model.Model(client)[source]

Bases: object

property client

a tdclient.client.Client instance

Type

tdclient.client.Client

tdclient.models

tdclient.models.BulkImport = <class 'tdclient.bulk_import_model.BulkImport'>[source]

Bulk-import session on Treasure Data Service

tdclient.models.Database = <class 'tdclient.database_model.Database'>[source]

Database on Treasure Data Service

tdclient.models.Schema = <class 'tdclient.job_model.Schema'>[source]

Schema of a database table on Treasure Data Service

tdclient.models.Job = <class 'tdclient.job_model.Job'>[source]

Job on Treasure Data Service

tdclient.models.Result = <class 'tdclient.result_model.Result'>[source]

Result on Treasure Data Service

tdclient.models.ScheduledJob = <class 'tdclient.schedule_model.ScheduledJob'>[source]

Scheduled job on Treasure Data Service

tdclient.models.Schedule = <class 'tdclient.schedule_model.Schedule'>[source]

Schedule on Treasure Data Service

tdclient.models.Table = <class 'tdclient.table_model.Table'>[source]

Database table on Treasure Data Service

tdclient.models.User = <class 'tdclient.user_model.User'>[source]

User on Treasure Data Service

tdclient.bulk_import_model

class tdclient.bulk_import_model.BulkImport(client, **kwargs)[source]

Bases: tdclient.model.Model

Bulk-import session on Treasure Data Service

commit(wait=False, wait_interval=5, timeout=None)[source]

Commit bulk import

delete()[source]

Delete bulk import

delete_part(part_name)[source]

Delete a part of a Bulk Import session

Parameters

part_name (str) – name of a part of the bulk import session

Returns

True if succeeded.

error_record_items()[source]

Fetch error record rows.

Yields

Error record

freeze()[source]

Freeze bulk import

list_parts()[source]

Return the list of available parts uploaded through bulk_import_upload_part().

Returns

A list of bulk import part names.

Return type

[str]

perform(wait=False, wait_interval=5, wait_callback=None)[source]

Perform bulk import

Parameters
  • wait (bool, optional) – If True, wait until the bulk import job finishes. Default False.

  • wait_interval (int, optional) – wait interval in seconds. Default 5.

  • wait_callback (callable, optional) – A callable to be called on every tick of the wait interval.

unfreeze()[source]

Unfreeze bulk import

update()[source]

Update all fields of the bulk import session

upload_file(part_name, fmt, file_like, **kwargs)[source]

Upload a part to Bulk Import session, from an existing file on filesystem.

Parameters
  • part_name (str) – name of a part of the bulk import session

  • fmt (str) – format of data type (e.g. “msgpack”, “json”, “csv”, “tsv”)

  • file_like (str or file-like) – the name of a file, or a file-like object, containing the data

  • **kwargs – extra arguments.

There is more documentation on fmt, file_like and **kwargs at file import parameters.

In particular, for “csv” and “tsv” data, you can change how data columns are parsed using the dtypes and converters arguments.

  • dtypes is a dictionary used to specify a datatype for individual columns, for instance {"col1": "int"}. The available datatypes are "bool", "float", "int", "str" and "guess". If a column is also mentioned in converters, then the function will be used, NOT the datatype.

  • converters is a dictionary used to specify a function that will be used to parse individual columns, for instance {"col1": int}.

The default behaviour is "guess", which makes a best-effort to decide the column datatype. See file import parameters for more details.
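The interaction between dtypes and converters can be sketched with plain Python values. The upload_file call is commented out because it needs an open bulk import session, and the session, part, and file names are hypothetical:

```python
# dtypes forces a datatype per column; converters supplies a parsing
# callable per column and takes precedence over dtypes for that column.
dtypes = {"col1": "int", "col2": "guess"}
converters = {"col1": int, "col3": float}

# session.upload_file("part1", "csv", "data.csv",
#                     dtypes=dtypes, converters=converters)

# A converter is just a callable applied to each raw column value:
parsed = converters["col1"]("42")
```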

upload_part(part_name, bytes_or_stream, size)[source]

Upload a part to bulk import session

Parameters
  • part_name (str) – name of a part of the bulk import session

  • bytes_or_stream (file-like) – a file-like object containing the part

  • size (int) – the size of the part
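A sketch of how bytes_or_stream and size might be prepared. The payload here is arbitrary illustration data (not a valid part in any particular format), and the upload_part call is commented out since it requires an open session:

```python
import io

# Build an in-memory part and compute its size in bytes.
payload = b'{"time": 1500000000, "value": 1}\n'  # illustrative bytes only
size = len(payload)
stream = io.BytesIO(payload)

# session.upload_part("part1", stream, size)
```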

STATUS_COMMITTED = 'committed'
STATUS_COMMITTING = 'committing'
STATUS_PERFORMING = 'performing'
STATUS_READY = 'ready'
STATUS_UPLOADING = 'uploading'
property database

the name of the database the bulk import session is working on

property error_parts

The number of error parts.

property error_records

The number of error records.

property job_id

Job ID

property name

A name of the bulk import session

property status

The status of the bulk import session in a string

property table

the name of the table the bulk import session is working on

property upload_frozen

Whether the upload is frozen.

property valid_parts

The number of valid parts.

property valid_records

The number of valid records.

tdclient.database_model

class tdclient.database_model.Database(client, db_name, **kwargs)[source]

Bases: tdclient.model.Model

Database on Treasure Data Service

create_log_table(name)[source]
Parameters

name (str) – name of new log table

Returns

tdclient.model.Table

delete()[source]

Delete the database

Returns

True if succeeded.

query(q, **kwargs)[source]

Run a query on the database

Parameters

q (str) – a query string

Returns

tdclient.model.Job

table(table_name)[source]
Parameters

table_name (str) – name of a table

Returns

tdclient.model.Table

tables()[source]
Returns

a list of tdclient.model.Table

PERMISSIONS = ['administrator', 'full_access', 'import_only', 'query_only']
PERMISSION_LIST_TABLES = ['administrator', 'full_access']
property count

Total record count in the database.

Type

int

property created_at

Created datetime

Type

datetime.datetime

property name

a name of the database

Type

str

property org_name

organization name

Type

str

property permission

permission for the database (e.g. “administrator”, “full_access”, etc.)

Type

str

property updated_at

Updated datetime

Type

datetime.datetime

tdclient.job_model

class tdclient.job_model.Job(client, job_id, type, query, **kwargs)[source]

Bases: tdclient.model.Model

Job on Treasure Data Service

error()[source]
Returns

True if the job has finished with an error

finished()[source]
Returns

True if the job has finished, whether in success, error, or killed state

kill()[source]

Kill the job

Returns

a string representing the status of the killed job (“queued”, “running”)

killed()[source]
Returns

True if the job has been killed

queued()[source]
Returns

True if the job is queued

result()[source]
Yields

an iterator of rows in the result set

result_format(fmt)[source]
Parameters

fmt (str) – output format of result set

Yields

an iterator of rows in the result set

running()[source]
Returns

True if the job is running

status()[source]
Returns

a string representing the status of the job (“success”, “error”, “killed”, “queued”, “running”)

Return type

str

success()[source]
Returns

True if the job has finished successfully

update()[source]

Update all fields of the job

wait(timeout=None, wait_interval=5, wait_callback=None)[source]

Sleep until the job has been finished

Parameters
  • timeout (int, optional) – Timeout in seconds. No timeout by default.

  • wait_interval (int, optional) – wait interval in seconds. Default 5.

  • wait_callback (callable, optional) – A callable to be called on every tick of the wait interval.

FINISHED_STATUS = ['success', 'error', 'killed']
JOB_PRIORITY = {-2: 'VERY LOW', -1: 'LOW', 0: 'NORMAL', 1: 'HIGH', 2: 'VERY HIGH'}
STATUS_BOOTING = 'booting'
STATUS_ERROR = 'error'
STATUS_KILLED = 'killed'
STATUS_QUEUED = 'queued'
STATUS_RUNNING = 'running'
STATUS_SUCCESS = 'success'
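The status constants above can drive a simple polling loop. Job.wait() already implements this for you; the sketch below repeats the documented FINISHED_STATUS values as literals so it stands alone, and uses a canned status sequence in place of a real job.status:

```python
import time

# Mirrors Job.FINISHED_STATUS as documented above.
FINISHED_STATUS = ("success", "error", "killed")

def poll(get_status, wait_interval=5, max_ticks=100):
    """Poll get_status() until a finished status appears."""
    for _ in range(max_ticks):
        status = get_status()
        if status in FINISHED_STATUS:
            return status
        time.sleep(wait_interval)
    raise TimeoutError("job did not finish")

# In real code get_status would be job.status; here, a canned sequence:
statuses = iter(["queued", "running", "success"])
final = poll(lambda: next(statuses), wait_interval=0)
```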
property database

the name of the database the job is running on

property debug

a dict of debug output (e.g. “cmdout”, “stderr”)

property id

a string identifier of the job

property job_id

a string identifier of the job

property linked_result_export_job_id

Linked result export job ID from query job

property num_records

the number of records in the job result

property org_name

organization name

property priority

a string representing the priority of the job (e.g. “NORMAL”, “HIGH”, etc.)

property query

the query string of the job

property result_export_target_job_id

Associated query job ID from result export job ID

property result_schema

an array of arrays representing the names and types of result columns (Hive specific), e.g. [[“_c1”, “string”], [“_c2”, “bigint”]]

property result_size

the size of the job result

property result_url

the URL of the result on Treasure Data Service

property retry_limit

the automatic retry count

property type

the engine type of the job (e.g. “hive”, “presto”, etc.)

property url

the URL of the job on Treasure Data Service

property user_name

the name of the user who executed the job

class tdclient.job_model.Schema(fields=None)[source]

Bases: object

Schema of a database table on Treasure Data Service

class Field(name, type)[source]

Bases: object

property name

the name of the field

Type

str

property type

the type of the field

Type

str

add_field(name, type)[source]

Add a field to the schema.

property fields

the list of fields in the schema

Type

[Field]

tdclient.result_model

class tdclient.result_model.Result(client, name, url, org_name)[source]

Bases: tdclient.model.Model

Result on Treasure Data Service

property name

a name for an authentication

Type

str

property org_name

organization name

Type

str

property url

a result output URL

Type

str

tdclient.schedule_model

class tdclient.schedule_model.Schedule(client, *args, **kwargs)[source]

Bases: tdclient.model.Model

Schedule on Treasure Data Service

run(time, num=None)[source]

Run a scheduled job

Parameters
  • time (int) – Time in Unix epoch format that would be set as TD_SCHEDULED_TIME

  • num (int) – Indicates how many times the query will be executed. Value should be 9 or less.

Returns

[tdclient.models.ScheduledJob]
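The time argument is a Unix epoch value; here is a sketch of deriving one from a timezone-aware datetime (the run call is commented out since it requires a live Schedule object):

```python
from datetime import datetime, timezone

# TD_SCHEDULED_TIME will be set from this Unix epoch value.
scheduled = datetime(2024, 1, 15, 9, 0, 0, tzinfo=timezone.utc)
epoch = int(scheduled.timestamp())

# jobs = schedule.run(epoch, num=1)   # [tdclient.models.ScheduledJob]
```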

property created_at

Creation date

Type

datetime.datetime

property cron

The configured schedule of a scheduled job.

Returns a string representing the schedule in cron form, or None if the job is not scheduled to run (a saved query)

property database

The target database of a scheduled job

property delay

A delay that ensures all buffered events are imported before running the query.

property name

The name of a scheduled job

property next_time

Schedule for next run

Type

datetime.datetime

property org_name

organization name

Type

str

property priority

The priority of a scheduled job

property query

The query string of a scheduled job

property result_url

The result output configuration in URL form of a scheduled job

property retry_limit

Automatic retry count.

property timezone

The time zone of a scheduled job

property type

Query type. {“presto”, “hive”}.

property user_name

User name of a scheduled job

class tdclient.schedule_model.ScheduledJob(client, scheduled_at, job_id, type, query, **kwargs)[source]

Bases: tdclient.job_model.Job

Scheduled job on Treasure Data Service

property scheduled_at

a datetime.datetime representing the schedule of the next invocation of the job

tdclient.table_model

class tdclient.table_model.Table(*args, **kwargs)[source]

Bases: tdclient.model.Model

Database table on Treasure Data Service

delete()[source]

Delete the table

Returns

a string representing the type of the deleted table

export_data(storage_type, **kwargs)[source]

Export data from Treasure Data Service

Parameters
  • storage_type (str) – type of the storage

  • **kwargs (dict) –

    optional parameters. Assuming the following keys:

    • access_key_id (str):

      ID to access the information to be exported.

    • secret_access_key (str):

      Password for the access_key_id.

    • file_prefix (str, optional):

      Filename of exported file. Default: “<database_name>/<table_name>”

    • file_format (str, optional):

      File format of the information to be exported. {“jsonl.gz”, “tsv.gz”, “json.gz”}

    • from (int, optional):

      Start time of the data to be exported, in Unix epoch format.

    • to (int, optional):

      End time of the data to be exported, in Unix epoch format.

    • assume_role (str, optional):

      Assume role.

    • bucket (str):

      Name of bucket to be used.

    • domain_key (str, optional):

      Job domain key.

    • pool_name (str, optional):

      For Presto only. Pool name to be used; if not specified, the default pool is used.

Returns

tdclient.models.Job
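A sketch of assembling the keyword arguments described above. The credential values are placeholders, the bucket name is hypothetical, and the call is commented out since it requires a Table object, valid credentials, and a storage_type matching your destination (assumed "s3" here):

```python
# Placeholder settings for an export to object storage.
export_params = {
    "access_key_id": "AKIA...",      # placeholder credential
    "secret_access_key": "***",      # placeholder credential
    "bucket": "my-export-bucket",    # hypothetical bucket name
    "file_format": "jsonl.gz",
    "from": 1704067200,              # 2024-01-01 00:00:00 UTC
    "to": 1704153600,                # 2024-01-02 00:00:00 UTC
}

# job = table.export_data("s3", **export_params)   # tdclient.models.Job
```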

import_data(format, bytes_or_stream, size, unique_id=None)[source]

Import data into Treasure Data Service

Parameters
  • format (str) – format of data type (e.g. “msgpack.gz”)

  • bytes_or_stream (str or file-like) – a byte string or a file-like object containing the data

  • size (int) – the length of the data

  • unique_id (str) – a unique identifier of the data

Returns

elapsed time to import the data, in seconds (float)

import_file(format, file, unique_id=None)[source]

Import data into Treasure Data Service, from an existing file on filesystem.

This method decompresses/deserializes records from the given file, then converts them into a format acceptable to Treasure Data Service (“msgpack.gz”).

Parameters
  • format (str) – format of the data (e.g. “msgpack”, “json”, “csv”, “tsv”)

  • file (str or file-like) – the name of a file, or a file-like object, containing the data

  • unique_id (str) – a unique identifier of the data

Returns

elapsed time to import the data, in seconds (float)

tail(count, to=None, _from=None)[source]
Parameters
  • count (int) – Number of records to show from the end.

  • to – Deprecated parameter.

  • _from – Deprecated parameter.

Returns

the contents of the table in reverse order based on the registered time (last data first).

property count

total number of records in the table

Type

int

property created_at

Created datetime

Type

datetime.datetime

property database_name

the name of the database

property db_name

the name of the database

property estimated_storage_size

estimated storage size

property estimated_storage_size_string

the estimated size of the table in human-readable form

property expire_days

the number of days until expiration (int)

property identifier

a string identifier of the table

property last_import

Last import time

Type

datetime.datetime

property last_log_timestamp

Last log timestamp

Type

datetime.datetime

property name

the name of the table

property permission

permission for the database (e.g. “administrator”, “full_access”, etc.)

Type

str

property primary_key

add docstring

Type

TODO

property primary_key_type

add docstring

Type

TODO

property schema

The schema of the table

Type

[[column_name: str, column_type: str, alias: str]]

property table_name

the name of the table

property type

the type of the table, as a string

property updated_at

Updated datetime

Type

datetime.datetime