Client

The tdclient.client.Client class is the public interface for tdclient. It provides methods for calling the REST API.

tdclient.client

class tdclient.client.Client(*args, **kwargs)[source]

Bases: object

API Client for Treasure Data Service

add_apikey(name)[source]
Parameters

name (str) – name of the user

Returns

True if success

add_user(name, org, email, password)[source]

Add a new user

Parameters
  • name (str) – name of the user

  • org (str) – organization

  • email (str) – e-mail address

  • password (str) – password

Returns

True if success

bulk_import(name)[source]

Get a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

tdclient.models.BulkImport

bulk_import_delete_part(name, part_name)[source]

Delete a part from a bulk import session

Parameters
  • name (str) – name of a bulk import session

  • part_name (str) – name of a part of the bulk import session

Returns

True if success

bulk_import_error_records(name)[source]
Parameters

name (str) – name of a bulk import session

Returns

an iterator of error records

bulk_import_upload_file(name, part_name, format, file)[source]

Upload a part to a bulk import session from an existing file on the filesystem.

Parameters
  • name (str) – name of a bulk import session

  • part_name (str) – name of a part of the bulk import session

  • format (str) – format of data type (e.g. “msgpack”, “json”)

  • file (str or file-like) – the name of a file, or a file-like object that contains the data

bulk_import_upload_part(name, part_name, bytes_or_stream, size)[source]

Upload a part to a bulk import session

Parameters
  • name (str) – name of a bulk import session

  • part_name (str) – name of a part of the bulk import session

  • bytes_or_stream (file-like) – a file-like object that contains the part

  • size (int) – the size of the part
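
As a minimal sketch of how the four arguments fit together, the helper below wraps an in-memory payload in a file-like object and passes its length as size. The helper name upload_bytes_part is hypothetical, and treating size as a byte count is an assumption; this page does not say which data format the service expects for a part, so the payload must already be in that format.

```python
import io


def upload_bytes_part(client, session_name, part_name, payload):
    """Upload an in-memory payload as one part of a bulk import session.

    `client` is any object exposing bulk_import_upload_part() with the
    signature documented above. This helper is illustrative and is not
    part of tdclient.
    """
    if not isinstance(payload, (bytes, bytearray)):
        raise TypeError("payload must be bytes or bytearray")
    stream = io.BytesIO(payload)  # file-like wrapper around the bytes
    size = len(payload)           # assumption: size is the length in bytes
    client.bulk_import_upload_part(session_name, part_name, stream, size)
    return size
```

Keeping the stream and the size derived from the same bytes object avoids passing a size that does not match the data.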

bulk_imports()[source]

List bulk import sessions

Returns

a list of tdclient.models.BulkImport

change_database(db_name, table_name, new_db_name)[source]

Move a target table from its original database to a new destination database.

Parameters
  • db_name (str) – Target database name.

  • table_name (str) – Target table name.

  • new_db_name (str) – Name of the destination database to move the table to.

Returns

True if succeeded.

Return type

bool

close()[source]

Close opened API connections.
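
One way to make sure close() is always called is the standard library's contextlib.closing, sketched below. The helper name count_databases is hypothetical; it only relies on the databases() and close() methods listed in this reference.

```python
from contextlib import closing


def count_databases(client):
    """Return the number of databases, then close the client.

    `client` is any object exposing databases() and close(), as
    documented for tdclient.client.Client on this page.
    """
    # closing() calls client.close() on exit, even if databases() raises.
    with closing(client) as c:
        return len(c.databases())
```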

commit_bulk_import(name)[source]

Commit a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

True if success

create_bulk_import(name, database, table, params=None)[source]

Create new bulk import session

Parameters
  • name (str) – name of new bulk import session

  • database (str) – name of a database

  • table (str) – name of a table

Returns

tdclient.models.BulkImport

create_database(db_name, **kwargs)[source]
Parameters

db_name (str) – name of a database to create

Returns

True if success

create_log_table(db_name, table_name)[source]
Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table to create

Returns

True if success

create_result(name, url, params=None)[source]

Create a new authentication with the specified name.

Parameters
  • name (str) – Authentication name.

  • url (str) – URL of the authentication to be created, e.g. “ftp://test.com/”

  • params (dict, optional) – Extra parameters.

Returns

True if succeeded.

Return type

bool

create_schedule(name, params=None)[source]

Create a new scheduled query with the specified name.

Parameters
  • name (str) – Scheduled query name.

  • params (dict, optional) –

    Extra parameters.

    • type (str):

      Query type. {“presto”, “hive”}. Default: “hive”

    • database (str):

      Target database name.

    • timezone (str):

      Scheduled query’s timezone, e.g. “UTC”. For details, see: https://gist.github.com/frsyuki/4533752

    • cron (str, optional):

      Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://support.treasuredata.com/hc/en-us/articles/360001451088-Scheduled-Jobs-Web-Console

    • delay (int, optional):

      A delay ensures all buffered events are imported before running the query. Default: 0

    • query (str):

      The query to run on the schedule. See also: https://support.treasuredata.com/hc/en-us/articles/360012069493-SQL-Examples-of-Scheduled-Queries

    • priority (int, optional):

      Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0

    • retry_limit (int, optional):

      Automatic retry count. Default: 0

    • engine_version (str, optional):

      Engine version to be used. If none is specified, the account’s default engine version is used. {“stable”, “experimental”}

    • pool_name (str, optional):

      For Presto only. Name of the pool to use. If not specified, the default pool is used.

    • result (str, optional):

      Location in which to store the result of the query, e.g. ‘tableau://user:password@host.com:1234/datasource’

Returns

Start date time.

Return type

datetime.datetime
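
The params dictionary described above can be assembled and checked before it is sent. The following is a minimal sketch: the helper name build_schedule_params and its defaults are hypothetical, while the keys and the allowed values for type and priority come from the parameter list above.

```python
def build_schedule_params(database, query, cron="@daily", timezone="UTC",
                          engine="presto", priority=0, retry_limit=0):
    """Build a params dict for create_schedule() (illustrative helper)."""
    if engine not in ("presto", "hive"):
        raise ValueError("type must be 'presto' or 'hive'")
    if not -2 <= priority <= 2:
        raise ValueError("priority must be between -2 and 2")
    return {
        "type": engine,
        "database": database,
        "timezone": timezone,
        "cron": cron,
        "query": query,
        "priority": priority,
        "retry_limit": retry_limit,
    }
```

With a real client, the result would be passed as the second argument, for example client.create_schedule("daily_count", build_schedule_params("sample_db", "SELECT 1")), where the schedule and database names are placeholders.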

database(db_name)[source]
Parameters

db_name (str) – name of a database

Returns

tdclient.models.Database

databases()[source]
Returns

a list of tdclient.models.Database

delete_bulk_import(name)[source]

Delete a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

True if success

delete_database(db_name)[source]
Parameters

db_name (str) – name of database to delete

Returns

True if success

delete_result(name)[source]

Delete the authentication having the specified name.

Parameters

name (str) – Authentication name.

Returns

True if succeeded.

Return type

bool

delete_schedule(name)[source]

Delete the scheduled query with the specified name.

Parameters

name (str) – Target scheduled query name.

Returns

Tuple of cron and query.

Return type

(str, str)

delete_table(db_name, table_name)[source]

Delete a table

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

Returns

a string that represents the type of the deleted table

export_data(db_name, table_name, storage_type, params=None)[source]

Export data from Treasure Data Service

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

  • storage_type (str) – type of the storage

  • params (dict) –

    Optional parameters. The following keys are recognized:

    • access_key_id (str):

      ID to access the information to be exported.

    • secret_access_key (str):

      Password for the access_key_id.

    • file_prefix (str, optional):

      Filename of exported file. Default: “<database_name>/<table_name>”

    • file_format (str, optional):

      File format of the information to be exported. {“jsonl.gz”, “tsv.gz”, “json.gz”}

    • from (int, optional):

      Start time of the data to be exported, in Unix epoch format.

    • to (int, optional):

      End time of the data to be exported, in Unix epoch format.

    • assume_role (str, optional): Assume role.

    • bucket (str):

      Name of bucket to be used.

    • domain_key (str, optional):

      Job domain key.

    • pool_name (str, optional):

      For Presto only. Name of the pool to use. If not specified, the default pool is used.

Returns

tdclient.models.Job
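
Because from is a reserved word in Python, the time range has to be supplied through dictionary keys rather than keyword arguments. The sketch below builds such a dictionary from timezone-aware datetimes. The helper name build_export_params is hypothetical, and the "s3" storage type in the usage note is an assumption that this page does not confirm.

```python
from datetime import datetime, timezone


def build_export_params(bucket, access_key_id, secret_access_key,
                        start=None, end=None, file_format="jsonl.gz"):
    """Build a params dict for export_data() (illustrative helper)."""
    params = {
        "bucket": bucket,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "file_format": file_format,
    }
    for key, moment in (("from", start), ("to", end)):
        if moment is None:
            continue
        if moment.tzinfo is None:
            # Naive datetimes would be interpreted in local time.
            raise ValueError("start and end must be timezone-aware")
        params[key] = int(moment.timestamp())  # Unix epoch seconds
    return params
```

A call might then look like client.export_data("sample_db", "events", "s3", params), with the database and table names as placeholders.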

freeze_bulk_import(name)[source]

Freeze a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

True if success

history(name, _from=None, to=None)[source]

Get the history details of the saved query for the past 90 days.

Parameters
  • name (str) – Target name of the scheduled query.

  • _from (int, optional) – Index of the first record in the run history to fetch. Default: 0. Note: counting starts from zero, so the first record in the list has index 0.

  • to (int, optional) – Index up to which records in the run history are fetched. Default: 20

Returns

[tdclient.models.ScheduledJob]

import_data(db_name, table_name, format, bytes_or_stream, size, unique_id=None)[source]

Import data into Treasure Data Service

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

  • format (str) – format of data type (e.g. “msgpack.gz”)

  • bytes_or_stream (str or file-like) – a byte string or a file-like object that contains the data

  • size (int) – the length of the data

  • unique_id (str) – a unique identifier of the data

Returns

a float that represents the elapsed time, in seconds, to import the data

import_file(db_name, table_name, format, file, unique_id=None)[source]

Import data into Treasure Data Service, from an existing file on filesystem.

This method decompresses and deserializes records from the given file, then converts them into a format accepted by Treasure Data Service (“msgpack.gz”).

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

  • format (str) – format of data type (e.g. “msgpack”, “json”)

  • file (str or file-like) – the name of a file, or a file-like object that contains the data

  • unique_id (str) – a unique identifier of the data

Returns

a float that represents the elapsed time to import the data

job(job_id)[source]

Get a job from job_id

Parameters

job_id (str) – job id

Returns

tdclient.models.Job

job_result(job_id)[source]
Parameters

job_id (str) – job id

Returns

a list of the rows in the result set

job_result_each(job_id)[source]
Parameters

job_id (str) – job id

Returns

an iterator over the rows in the result set

job_result_format(job_id, format)[source]
Parameters
  • job_id (str) – job id

  • format (str) – output format of result set

Returns

a list of the rows in the result set

job_result_format_each(job_id, format)[source]
Parameters
  • job_id (str) – job id

  • format (str) – output format of result set

Returns

an iterator over the rows in the result set

job_status(job_id)[source]
Parameters

job_id (str) – job id

Returns

a string that represents the status of the job (“success”, “error”, “killed”, “queued”, “running”)
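
The status strings above lend themselves to a simple polling loop. The following is a minimal sketch, assuming that “success”, “error” and “killed” are the only terminal states; the helper name wait_for_job, the interval and the timeout are hypothetical choices, not part of tdclient.

```python
import time

# Assumption: these three statuses mean the job will not change any more.
FINISHED = {"success", "error", "killed"}


def wait_for_job(client, job_id, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll job_status() until the job finishes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = client.job_status(job_id)
        if status in FINISHED:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)  # wait before asking again
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.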

jobs(_from=None, to=None, status=None, conditions=None)[source]

List jobs

Parameters
  • _from (int, optional) – Gets the Job from the nth index in the list. Default: 0.

  • to (int, optional) – Gets the Job up to the nth index in the list. By default, the first 20 jobs in the list are displayed

  • status (str, optional) – Filter by given status. {“queued”, “running”, “success”, “error”}

  • conditions (str, optional) – Condition for TIMESTAMPDIFF() to search for slow queries. Avoid using this parameter as it can be dangerous.

Returns

a list of tdclient.models.Job

kill(job_id)[source]
Parameters

job_id (str) – job id

Returns

a string that represents the status of the killed job (“queued”, “running”)

list_apikeys(name)[source]
Parameters

name (str) – name of the user

Returns

a list of API key strings

list_bulk_import_parts(name)[source]

List parts of a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

a list of strings that represent the names of the parts

partial_delete(db_name, table_name, to, _from, params=None)[source]

Create a job to partially delete the contents of the table with the given time range.

Parameters
  • db_name (str) – Target database name.

  • table_name (str) – Target table name.

  • to (int) – Time in Unix Epoch format indicating the End date and time of the data to be deleted. Should be set only by the hour. Minutes and seconds values will not be accepted.

  • _from (int) – Time in Unix Epoch format indicating the Start date and time of the data to be deleted. Should be set only by the hour. Minutes and seconds values will not be accepted.

  • params (dict, optional) –

    Extra parameters.

    • pool_name (str, optional):

      Indicates the resource pool to execute this job. If not provided, the account’s default resource pool would be used.

    • domain_key (str, optional):

      Domain key that will be assigned to the partial delete job to be created

Returns

tdclient.models.Job
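
Since both bounds must fall exactly on the hour, it can help to validate them before creating the job. The sketch below rejects misaligned values rather than rounding them, so that no data outside the intended range is deleted by accident. The helper names hour_aligned_epoch and partial_delete_range are hypothetical.

```python
from datetime import datetime, timezone

SECONDS_PER_HOUR = 3600


def hour_aligned_epoch(moment):
    """Convert a timezone-aware datetime to epoch seconds, on the hour only."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    epoch = int(moment.timestamp())
    if moment.microsecond or epoch % SECONDS_PER_HOUR:
        raise ValueError(f"{moment.isoformat()} is not on an hour boundary")
    return epoch


def partial_delete_range(start, end):
    """Return keyword arguments named like partial_delete()'s parameters."""
    _from = hour_aligned_epoch(start)
    to = hour_aligned_epoch(end)
    if to <= _from:
        raise ValueError("end must be later than start")
    return {"to": to, "_from": _from}
```

Note that the signature takes to before _from, so passing the result by keyword, as in client.partial_delete("sample_db", "events", **partial_delete_range(start, end)), avoids swapping the two by accident. The database and table names here are placeholders.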

perform_bulk_import(name)[source]

Perform a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

tdclient.models.Job

query(db_name, q, result_url=None, priority=None, retry_limit=None, type='hive', **kwargs)[source]

Run a query on the specified database.

Parameters
  • db_name (str) – name of a database

  • q (str) – a query string

  • result_url (str) – result output URL. e.g., postgresql://<username>:<password>@<hostname>:<port>/<database>/<table>

  • priority (int or str) – priority (e.g. “NORMAL”, “HIGH”, etc.)

  • retry_limit (int) – retry limit

  • type (str) – name of a query engine

Returns

tdclient.models.Job

Raises

ValueError – if an unknown query type is specified
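
A typical round trip submits the query, waits for the returned job, and reads its rows. The sketch below assumes that the returned tdclient.models.Job object offers wait() and result() methods; those belong to the Job model and are not documented on this page, so treat them as assumptions. The helper name run_query is hypothetical.

```python
def run_query(client, db_name, sql, engine="presto"):
    """Submit a query and return its rows as a list (illustrative helper).

    `client` is any object exposing query() as documented above.
    """
    job = client.query(db_name, sql, type=engine)
    job.wait()                  # assumption: blocks until the job finishes
    return list(job.result())   # assumption: iterates over the result rows
```

For large result sets, iterating over the rows instead of building a list keeps memory use lower.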

remove_apikey(name, apikey)[source]
Parameters
  • name (str) – name of the user

  • apikey (str) – an API key to remove

Returns

True if success

remove_user(name)[source]

Remove a user

Parameters

name (str) – name of the user

Returns

True if success

results()[source]

Get the list of all the available authentications.

Returns

a list of tdclient.models.Result

run_schedule(name, time, num)[source]

Execute the specified query.

Parameters
  • name (str) – Target scheduled query name.

  • time (int) – Time in Unix epoch format that would be set as TD_SCHEDULED_TIME

  • num (int) – Indicates how many times the query will be executed. Value should be 9 or less.

Returns

[tdclient.models.ScheduledJob]

schedules()[source]

Get the list of all the scheduled queries.

Returns

[tdclient.models.Schedule]

server_status()[source]
Returns

a string that represents the current server status.

swap_table(db_name, table_name1, table_name2)[source]
Parameters
  • db_name (str) – name of a database

  • table_name1 (str) – original table name

  • table_name2 (str) – table name you want to rename to

Returns

True if success

table(db_name, table_name)[source]
Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

Returns

tdclient.models.Table

Raises

tdclient.api.NotFoundError – if the table doesn’t exist

tables(db_name)[source]

List existing tables

Parameters

db_name (str) – name of a database

Returns

a list of tdclient.models.Table

tail(db_name, table_name, count, to=None, _from=None, block=None)[source]

Get the contents of the table in reverse order based on the registered time (last data first).

Parameters
  • db_name (str) – Target database name.

  • table_name (str) – Target table name.

  • count (int) – Number of records to show from the end.

  • to – Deprecated parameter.

  • _from – Deprecated parameter.

  • block – Deprecated parameter.

Returns

Contents of the table.

Return type

[dict]

unfreeze_bulk_import(name)[source]

Unfreeze a bulk import session

Parameters

name (str) – name of a bulk import session

Returns

True if success

update_expire(db_name, table_name, expire_days)[source]

Set the expiration date of a table

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

  • expire_days (int) – expiration date in days from today

Returns

True if success

update_schedule(name, params=None)[source]

Update the scheduled query.

Parameters
  • name (str) – Target scheduled query name.

  • params (dict) –

    Extra parameters.

    • type (str):

      Query type. {“presto”, “hive”}. Default: “hive”

    • database (str):

      Target database name.

    • timezone (str):

      Scheduled query’s timezone, e.g. “UTC”. For details, see: https://gist.github.com/frsyuki/4533752

    • cron (str, optional):

      Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://support.treasuredata.com/hc/en-us/articles/360001451088-Scheduled-Jobs-Web-Console

    • delay (int, optional):

      A delay ensures all buffered events are imported before running the query. Default: 0

    • query (str):

      The query to run on the schedule. See also: https://support.treasuredata.com/hc/en-us/articles/360012069493-SQL-Examples-of-Scheduled-Queries

    • priority (int, optional):

      Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0

    • retry_limit (int, optional):

      Automatic retry count. Default: 0

    • engine_version (str, optional):

      Engine version to be used. If none is specified, the account’s default engine version is used. {“stable”, “experimental”}

    • pool_name (str, optional):

      For Presto only. Name of the pool to use. If not specified, the default pool is used.

    • result (str, optional):

      Location in which to store the result of the query, e.g. ‘tableau://user:password@host.com:1234/datasource’

update_schema(db_name, table_name, schema)[source]

Updates the schema of a table

Parameters
  • db_name (str) – name of a database

  • table_name (str) – name of a table

  • schema (list) –

    a list that represents the schema definition (will be converted to JSON), e.g.

    [
        ["member_id", # column name
         "string", # data type
         "mem_id", # alias of the column name
        ],
        ["row_index", "long", "row_ind"],
        ...
    ]
    

Returns

True if success
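
The nested-list shape shown above can be generated from shorter column descriptions. In this sketch the helper name build_schema is hypothetical, and falling back to the column name when no alias is given is an assumption made for illustration; the page only shows entries with all three elements.

```python
def build_schema(columns):
    """Turn (name, type) or (name, type, alias) tuples into schema entries."""
    schema = []
    for column in columns:
        name, data_type, *rest = column
        # Assumption: reuse the column name when no alias is supplied.
        alias = rest[0] if rest else name
        schema.append([name, data_type, alias])
    return schema
```

The result would be passed as the third argument, for example client.update_schema("sample_db", "members", build_schema([("member_id", "string", "mem_id")])), with placeholder database and table names.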

users()[source]

List users

Returns

a list of tdclient.models.User

property api

an instance of tdclient.api.API

property apikey

API key string.