API

tdclient.api.API class is an internal class represents API.

tdclient.api

class tdclient.api.API(apikey: str | None = None, user_agent: str | None = None, endpoint: str | None = None, headers: dict[str, str] | None = None, retry_post_requests: bool = False, max_cumul_retry_delay: int = 600, http_proxy: str | None = None, **kwargs: Any)[source]

Bases: BulkImportAPI, ConnectorAPI, DatabaseAPI, ExportAPI, ImportAPI, JobAPI, ResultAPI, ScheduleAPI, ServerStatusAPI, TableAPI, UserAPI

Internal API class

Parameters:
  • apikey (str) – the API key of Treasure Data Service. If None is given, TD_API_KEY will be used if available.

  • user_agent (str) – custom User-Agent.

  • endpoint (str) – custom endpoint URL. If None is given, TD_API_SERVER will be used if available.

  • headers (dict) – custom HTTP headers.

  • retry_post_requests (bool) – Specify whether allowing API client to retry POST requests. False by default.

  • max_cumul_retry_delay (int) – maximum retry limit in seconds. 600 seconds by default.

  • http_proxy (str) – HTTP proxy setting. if None is given, HTTP_PROXY will be used if available.

build_request(path: str | None = None, headers: dict[str, str] | None = None, endpoint: str | None = None) tuple[str, dict[str, str]][source]
checked_json(body: bytes, required: list[str]) dict[str, Any][source]
close() None[source]
delete(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
put(path: str, bytes_or_stream: bytes | bytearray | IO[bytes], size: int, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes | str) None[source]
send_request(method: str, url: str, fields: dict[str, Any] | None = None, body: bytes | bytearray | memoryview | array[int] | IO[bytes] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) BaseHTTPResponse[source]
DEFAULT_ENDPOINT = 'https://api.treasuredata.com/'
DEFAULT_IMPORT_ENDPOINT = 'https://api-import.treasuredata.com/'
property apikey: str | None
property endpoint: str

tdclient.bulk_import_api

class tdclient.bulk_import_api.BulkImportAPI[source]

Bases: object

Enable bulk importing of data to the targeted database and table.

This class is inherited by tdclient.api.API.

static validate_part_name(part_name: str) None[source]

Make sure the part_name is valid

Parameters:

part_name (str) – The part name the user is trying to use

bulk_import_delete_part(name: str, part_name: str, params: dict[str, Any] | None = None) bool[source]

Delete the imported information with the specified name.

Parameters:
  • name (str) – Bulk import name.

  • part_name (str) – Bulk import part name.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded.

bulk_import_error_records(name: str, params: dict[str, Any] | None = None) Iterator[dict[str, Any]][source]

List the records that have errors under the specified bulk import name.

Parameters:
  • name (str) – Bulk import name.

  • params (dict, optional) – Extra parameters.

Yields:

Row of the data

bulk_import_upload_file(name: str, part_name: str, format: Literal['msgpack', 'msgpack.gz', 'json', 'json.gz', 'csv', 'csv.gz', 'tsv', 'tsv.gz'], file: str | bytes | IO[bytes], **kwargs: Any) None[source]

Upload a file with bulk import having the specified name.

Parameters:
  • name (str) – Bulk import name.

  • part_name (str) – Bulk import part name.

  • format (str) – Format name. {msgpack, json, csv, tsv}

  • file (str or file-like) – the name of a file, or a file-like object, containing the data

  • **kwargs – Extra arguments.

There is more documentation on format, file and **kwargs at file import parameters.

In particular, for “csv” and “tsv” data, you can change how data columns are parsed using the dtypes and converters arguments.

  • dtypes is a dictionary used to specify a datatype for individual columns, for instance {"col1": "int"}. The available datatypes are "bool", "float", "int", "str" and "guess". If a column is also mentioned in converters, then the function will be used, NOT the datatype.

  • converters is a dictionary used to specify a function that will be used to parse individual columns, for instance {"col1", int}.

The default behaviour is "guess", which makes a best-effort to decide the column datatype. See file import parameters for more details.

bulk_import_upload_part(name: str, part_name: str, stream: bytes | bytearray | IO[bytes], size: int) None[source]

Upload bulk import having the specified name and part in the path.

Parameters:
  • name (str) – Bulk import name.

  • part_name (str) – Bulk import part name.

  • stream (str or file-like) – Byte string or file-like object contains the data

  • size (int) – The length of the data.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
commit_bulk_import(name: str, params: dict[str, Any] | None = None) bool[source]

Commit the bulk import information having the specified name.

Parameters:
  • name (str) – Bulk import name.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded.

create_bulk_import(name: str, db: str, table: str, params: BulkImportParams | None = None) bool[source]

Enable bulk importing of data to the targeted database and table and stores it in the default resource pool. Default expiration for bulk import is 30days.

Parameters:
  • name (str) – Name of the bulk import.

  • db (str) – Name of target database.

  • table (str) – Name of target table.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded

delete_bulk_import(name: str, params: dict[str, Any] | None = None) bool[source]

Delete the imported information with the specified name

Parameters:
  • name (str) – Name of bulk import.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded

freeze_bulk_import(name: str, params: dict[str, Any] | None = None) bool[source]

Freeze the bulk import with the specified name.

Parameters:
  • name (str) – Bulk import name.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded.

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
list_bulk_import_parts(name: str, params: dict[str, Any] | None = None) list[str][source]

Return the list of available parts uploaded through bulk_import_upload_part().

Parameters:
  • name (str) – Name of bulk import.

  • params (dict, optional) – Extra parameters.

Returns:

The list of bulk import part name.

Return type:

[str]

list_bulk_imports(params: dict[str, Any] | None = None) list[dict[str, Any]][source]

Return the list of available bulk imports :param params: Extra parameters. :type params: dict, optional

Returns:

The list of available bulk import details.

Return type:

[dict]

perform_bulk_import(name: str, params: dict[str, Any] | None = None) str[source]

Execute a job to perform bulk import with the indicated priority using the resource pool if indicated, else it will use the account’s default.

Parameters:
  • name (str) – Bulk import name.

  • params (dict, optional) – Extra parameters.

Returns:

Job ID

Return type:

str

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
put(path: str, bytes_or_stream: bytes | bytearray | IO[bytes], size: int, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes | str) None[source]
show_bulk_import(name: str) dict[str, Any][source]

Show the details of the bulk import with the specified name

Parameters:

name (str) – Name of bulk import.

Returns:

Detailed information of the bulk import.

Return type:

dict

unfreeze_bulk_import(name: str, params: dict[str, Any] | None = None) bool[source]

Unfreeze bulk_import with the specified name.

Parameters:
  • name (str) – Bulk import name.

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded.

tdclient.connector_api

class tdclient.connector_api.ConnectorAPI[source]

Bases: object

Access Data Connector API which handles Data Connector.

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
connector_create(name: str, database: str, table: str, job: dict[str, Any], params: dict[str, Any] | None = None) dict[str, Any][source]

Create a Data Connector session.

Parameters:
  • name (str) – name of the connector job

  • database (str) – name of the database to perform connector job

  • table (str) – name of the table to perform connector job

  • job (dict) – dict representation of load.yml

  • params (dict, optional) –

    Extra parameters

Returns:

dict

connector_delete(name: str) dict[str, Any][source]

Delete a Data Connector session.

Parameters:

name (str) – name of the connector job

Returns:

dict

connector_guess(job: dict[str, Any] | bytes) dict[str, Any][source]

Guess the Data Connector configuration

Parameters:

job (dict) – dict representation of seed.yml See Also: https://www.embulk.org/docs/built-in.html#guess-executor

Returns:

The configuration of the Data Connector.

Return type:

dict

Examples

>>> config = {
...     "in": {
...         "type": "s3",
...         "bucket": "your-bucket",
...         "path_prefix": "logs/csv-",
...         "access_key_id": "YOUR-AWS-ACCESS-KEY",
...         "secret_access_key": "YOUR-AWS-SECRET-KEY"
...     },
...     "out": {"mode": "append"},
...     "exec": {"guess_plugins": ["json", "query_string"]},
... }
>>> td.api.connector_guess(config)
{'config': {'in': {'type': 's3',
   'bucket': 'your-bucket',
   'path_prefix': 'logs/csv-',
   'access_key_id': 'YOUR-AWS-ACCESS-KEY',
   'secret_access_key': 'YOU-AWS-SECRET-KEY',
   'parser': {'charset': 'UTF-8',
    'newline': 'LF',
    'type': 'csv',
    'delimiter': ',',
    'quote': '"',
    'escape': '"',
    'trim_if_not_quoted': False,
    'skip_header_lines': 1,
    'allow_extra_columns': False,
    'allow_optional_columns': False,
    'columns': [{'name': 'sepal.length', 'type': 'double'},
     {'name': 'sepal.width', 'type': 'double'},
     {'name': 'petal.length', 'type': 'double'},
     {'name': 'petal.width', 'type': 'string'},
     {'name': 'variety', 'type': 'string'}]}},
  'out': {'mode': 'append'},
  'exec': {'guess_plugin': ['json', 'query_string']},
  'filters': [{'rules': [{'rule': 'upper_to_lower'},
     {'pass_types': ['a-z', '0-9'],
      'pass_characters': '_',
      'replace': '_',
      'rule': 'character_types'},
     {'pass_types': ['a-z'],
      'pass_characters': '_',
      'prefix': '_',
      'rule': 'first_character_types'},
     {'rule': 'unique_number_suffix', 'max_length': 128}],
    'type': 'rename'},
   {'from_value': {'mode': 'upload_time'},
    'to_column': {'name': 'time'},
    'type': 'add_time'}]}}
connector_history(name: str) list[dict[str, Any]][source]

Show the list of the executed jobs information for the Data Connector job.

Parameters:

name (str) – name of the connector job

Returns:

list

connector_issue(db: str, table: str, job: dict[str, Any]) str[source]

Create a Data Connector job.

Parameters:
  • db (str) – name of the database to perform connector job

  • table (str) – name of the table to perform connector job

  • job (dict) – dict representation of load.yml

Returns:

job Id

Return type:

str

connector_list() list[dict[str, Any]][source]

Show the list of available Data Connector sessions.

Returns:

list

connector_preview(job: dict[str, Any]) dict[str, Any][source]

Show the preview of the Data Connector job.

Parameters:

job (dict) – dict representation of load.yml

Returns:

dict

connector_run(name: str, **kwargs: Any) dict[str, Any][source]

Create a job to execute Data Connector session.

Parameters:
  • name (str) – name of the connector job

  • **kwargs (optional) –

    Extra parameters.

    • scheduled_time (int):

      Time in Unix epoch format that would be set as TD_SCHEDULED_TIME.

    • domain_key (str):

      Job domain key which is assigned to a single job.

Returns:

dict

connector_show(name: str) dict[str, Any][source]

Show a specific Data Connector session information.

Parameters:

name (str) – name of the connector job

Returns:

dict

connector_update(name: str, job: dict[str, Any]) dict[str, Any][source]

Update a specific Data Connector session.

Parameters:
Returns:

dict

delete(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
put(path: str, bytes_or_stream: bytes | bytearray | IO[bytes], size: int, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes) None[source]

tdclient.database_api

class tdclient.database_api.DatabaseAPI[source]

Bases: object

Access to Database of Treasure Data Service.

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
create_database(db: str, params: dict[str, Any] | None = None) bool[source]

Create a new database with the given name.

Parameters:
  • db (str) – Target database name.

  • params (dict) – Extra parameters.

Returns:

True if succeeded.

Return type:

bool

delete_database(db: str) bool[source]

Delete a database.

Parameters:

db (str) – Target database name.

Returns:

True if succeeded.

Return type:

bool

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
list_databases() dict[str, Any][source]

Get the list of all the databases of the account.

Returns:

Detailed database information. Each key of the dict is database name.

Return type:

dict

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes) None[source]

tdclient.export_api

class tdclient.export_api.ExportAPI[source]

Bases: object

Access to Export API.

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
export_data(db: str, table: str, storage_type: str, params: ExportParams | None = None) str[source]

Creates a job to export the contents from the specified database and table names.

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

  • storage_type (str) – Name of storage type. e.g. “s3”

  • params (dict) –

    Extra parameters. Assuming the following keys:

    • access_key_id (str):

      ID to access the information to be exported.

    • secret_access_key (str):

      Password for the access_key_id.

    • file_prefix (str, optional):

      Filename of exported file. Default: “<database_name>/<table_name>”

    • file_format (str, optional):

      File format of the information to be exported. {“jsonl.gz”, “tsv.gz”, “json.gz”}

    • from (int, optional):

      From Time of the data to be exported in Unix epoch format.

    • to (int, optional):

      End Time of the data to be exported in Unix epoch format.

    • assume_role (str, optional):

      Assume role.

    • bucket (str):

      Name of bucket to be used.

    • domain_key (str, optional):

      Job domain key.

    • pool_name (str, optional):

      For Presto only. Pool name to be used, if not specified, default pool would be used.

Returns:

Job ID.

Return type:

str

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes | str) None[source]

tdclient.import_api

class tdclient.import_api.ImportAPI[source]

Bases: object

Import data into Treasure Data Service.

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
import_data(db: str, table: str, format: Literal['msgpack', 'msgpack.gz', 'json', 'json.gz', 'csv', 'csv.gz', 'tsv', 'tsv.gz'], bytes_or_stream: bytes | bytearray | IO[bytes], size: int, unique_id: str | None = None) float[source]

Import data into Treasure Data Service

This method expects data from a file-like object formatted with “msgpack.gz”.

Parameters:
  • db (str) – name of a database

  • table (str) – name of a table

  • format (str) – format of data type (e.g. “msgpack.gz”)

  • bytes_or_stream (str or file-like) – a byte string or a file-like object contains the data

  • size (int) – the length of the data

  • unique_id (str) – a unique identifier of the data

Returns:

float represents the elapsed time to import data

import_file(db: str, table: str, format: Literal['msgpack', 'msgpack.gz', 'json', 'json.gz', 'csv', 'csv.gz', 'tsv', 'tsv.gz'], file: str | bytes | IO[bytes], unique_id: str | None = None, **kwargs: Any) float[source]

Import data into Treasure Data Service, from an existing file on filesystem.

This method will decompress/deserialize records from given file, and then convert it into format acceptable from Treasure Data Service (“msgpack.gz”). This method is a wrapper function to import_data.

Parameters:
  • db (str) – name of a database

  • table (str) – name of a table

  • format (str) – format of data type (e.g. “msgpack”, “json”)

  • file (str or file-like) – a name of a file, or a file-like object contains the data

  • unique_id (str) – a unique identifier of the data

Returns:

float represents the elapsed time to import data

put(path: str, bytes_or_stream: bytes | bytearray | IO[bytes], size: int, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes) None[source]

tdclient.job_api

class tdclient.job_api.JobAPI[source]

Bases: object

Access to Job API

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
download_job_result(job_id: str, path: str, num_threads: int = 4) bool[source]

Download the job result to the specified path.

Parameters:
  • job_id (int) – Job ID

  • path (str) – Path to save the job result

  • num_threads (int) – Number of threads to download the job result. Default is 4.

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None) AbstractContextManager[BaseHTTPResponse][source]
job_result(job_id: str) list[dict[str, Any]][source]

Return the job result.

Parameters:

job_id (int) – Job ID

Returns:

Job result in list

job_result_each(job_id: str) Iterator[dict[str, Any]][source]

Yield a row of the job result.

Parameters:

job_id (int) – Job ID

Yields:

Row in a result

job_result_format(job_id: str, format: str, header: bool = False) list[dict[str, Any]][source]

Return the job result with specified format.

Parameters:
  • job_id (int) – Job ID

  • format (str) – Output format of the job result information. “json” or “msgpack”

  • header (boolean) – Includes Header or not. False or True

Returns:

The query result of the specified job in.

job_result_format_each(job_id: str, format: str, header: bool = False, store_tmpfile: bool = False, num_threads: int = 4) Iterator[dict[str, Any]][source]

Yield a row of the job result with specified format.

Parameters:
  • job_id (int) – job ID

  • format (str) – Output format of the job result information. “json” or “msgpack”

  • header (bool) – Include Header info or not “True” or “False”

  • store_tmpfile (bool) – Download job result as a temporary file or not. Default is False. It works only when format is “msgpack”. “True” or “False”

  • num_threads (int) – Number of threads to download the job result when store_tmpfile is True. Default is 4.

Yields:

The query result of the specified job in.

job_status(job_id: str) str[source]

Show job status :param job_id: job ID :type job_id: str

Returns:

The status information of the given job id at last execution.

kill(job_id: str) str | None[source]

Stop the specific job if it is running.

Parameters:

job_id (str) – Job Id to kill

Returns:

Job status before killing

list_jobs(_from: int = 0, to: int | None = None, status: str | None = None, conditions: dict[str, Any] | None = None) list[dict[str, Any]][source]

Show the list of Jobs.

Parameters:
  • _from (int) – Gets the Job from the nth index in the list. Default: 0

  • to (int, optional) – Gets the Job up to the nth index in the list. By default, the first 20 jobs in the list are displayed

  • status (str, optional) – Filter by given status. {“queued”, “running”, “success”, “error”}

  • conditions (dict[str, Any], optional) – Condition for TIMESTAMPDIFF() to search for slow queries. Avoid using this parameter as it can be dangerous.

Returns:

a list of dict which represents a job

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
query(q: str, type: Literal['hive', 'presto', 'trino', 'bulkload'] = 'hive', db: str | None = None, result_url: str | None = None, priority: Literal[-2, -1, 0, 1, 2, 'VERY LOW', 'LOW', 'NORMAL', 'HIGH', 'VERY HIGH'] | None = None, retry_limit: int | None = None, **kwargs: Any) str[source]

Create a job for given query.

Parameters:
  • q (str) – Query string.

  • type (str) – Query type. hive, presto, trino, bulkload. Default: hive

  • db (str) – Database name.

  • result_url (str) – Result output URL. e.g., postgresql://<username>:<password>@<hostname>:<port>/<database>/<table>

  • priority (int or str) – Job priority. In str, “Normal”, “Very low”, “Low”, “High”, “Very high”. In int, the number in the range of -2 to 2.

  • retry_limit (int) – Automatic retry count.

  • **kwargs – Extra options.

Returns:

Job ID issued for the query

Return type:

str

raise_error(msg: str, res: BaseHTTPResponse, body: bytes | str) None[source]
show_job(job_id: str) dict[str, Any][source]

Return detailed information of a Job.

Parameters:

job_id (str) – job ID

Returns:

Detailed information of a job

Return type:

dict

JOB_PRIORITY: dict[str, int] = {'HIGH': 1, 'LOW': -1, 'NORM': 0, 'NORMAL': 0, 'VERY HIGH': 2, 'VERY LOW': -2, 'VERY-HIGH': 2, 'VERY-LOW': -2, 'VERY_HIGH': 2, 'VERY_LOW': -2}

tdclient.result_api

class tdclient.result_api.ResultAPI[source]

Bases: object

Access to Result API.

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
create_result(name: str, url: str, params: ResultParams | None = None) bool[source]

Create a new authentication with the specified name.

Parameters:
  • name (str) – Authentication name.

  • url (str) – Url of the authentication to be created. e.g. “ftp://test.com/

  • params (dict, optional) – Extra parameters.

Returns:

True if succeeded.

Return type:

bool

delete_result(name: str) bool[source]

Delete the authentication having the specified name.

Parameters:

name (str) – Authentication name.

Returns:

True if succeeded.

Return type:

bool

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
list_result() list[tuple[str, str, None]][source]

Get the list of all the available authentications.

Returns:

The list of tuple of name, Result output url, and

organization name (always None for api compatibility).

Return type:

[(str, str, None)]

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes) None[source]

tdclient.schedule_api

class tdclient.schedule_api.ScheduleAPI[source]

Bases: object

Access to Schedule API

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
create_schedule(name: str, params: ScheduleParams | None = None) datetime | None[source]

Create a new scheduled query with the specified name.

Parameters:
  • name (str) – Scheduled query name.

  • params (dict, optional) –

    Extra parameters.

    • type (str):

      Query type. {“presto”, “hive”}. Default: “hive”

    • database (str):

      Target database name.

    • timezone (str):

      Scheduled query’s timezone. e.g. “UTC” For details, see also: https://gist.github.com/frsyuki/4533752

    • cron (str, optional):

      Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://docs.treasuredata.com/articles/#!pd/Scheduling-Jobs-Using-TD-Console

    • delay (int, optional):

      A delay ensures all buffered events are imported before running the query. Default: 0

    • query (str):

      Is a language used to retrieve, insert, update and modify data. See also: https://docs.treasuredata.com/articles/#!pd/SQL-Examples-of-Scheduled-Queries

    • priority (int, optional):

      Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0

    • retry_limit (int, optional):

      Automatic retry count. Default: 0

    • engine_version (str, optional):

      Engine version to be used. If none is specified, the account’s default engine version would be set. {“stable”, “experimental”}

    • pool_name (str, optional):

      For Presto only. Pool name to be used, if not specified, default pool would be used.

    • result (str, optional):

      Location where to store the result of the query. e.g. ‘tableau://user:password@host.com:1234/datasource’

Returns:

Start date time.

Return type:

datetime.datetime

delete_schedule(name: str) tuple[str, str][source]

Delete the scheduled query with the specified name.

Parameters:

name (str) – Target scheduled query name.

Returns:

Tuple of cron and query.

Return type:

(str, str)

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
history(name: str, _from: int = 0, to: int | None = None) list[tuple[Any, ...]][source]

Get the history details of the saved query for the past 90days.

Parameters:
  • name (str) – Target name of the scheduled query.

  • _from (int, optional) – Indicates from which nth record in the run history would be fetched. Default: 0. Note: Count starts from zero. This means that the first record in the list has a count of zero.

  • to (int, optional) – Indicates up to which nth record in the run history would be fetched. Default: 20

Returns:

History of the scheduled query.

Return type:

dict

list_schedules() list[dict[str, Any]][source]

Get the list of all the scheduled queries.

Returns:

str, cron:str, query:str, database:str, result_url:str)]

Return type:

[(name

post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes) None[source]
run_schedule(name: str, time: int, num: int | None = None) list[tuple[Any, Any, datetime | None]][source]

Execute the specified query.

Parameters:
  • name (str) – Target scheduled query name.

  • time (int) – Time in Unix epoch format that would be set as TD_SCHEDULED_TIME

  • num (int, optional) – Indicates how many times the query will be executed. Value should be 9 or less.

Returns:

[(job_id:int, type:str, scheduled_at:str)]

Return type:

list of tuple

update_schedule(name: str, params: ScheduleParams | None = None) datetime | None[source]

Update the scheduled query.

Parameters:
  • name (str) – Target scheduled query name.

  • params (ScheduleParams | None) –

    Extra parameters.

    • type (str):

      Query type. {“presto”, “hive”}. Default: “hive”

    • database (str):

      Target database name.

    • timezone (str):

      Scheduled query’s timezone. e.g. “UTC” For details, see also: https://gist.github.com/frsyuki/4533752

    • cron (str, optional):

      Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://docs.treasuredata.com/articles/#!pd/Scheduling-Jobs-Using-TD-Console

    • delay (int, optional):

      A delay ensures all buffered events are imported before running the query. Default: 0

    • query (str):

      Is a language used to retrieve, insert, update and modify data. See also: https://docs.treasuredata.com/articles/#!pd/SQL-Examples-of-Scheduled-Queries

    • priority (int, optional):

      Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0

    • retry_limit (int, optional):

      Automatic retry count. Default: 0

    • engine_version (str, optional):

      Engine version to be used. If none is specified, the account’s default engine version would be set. {“stable”, “experimental”}

    • pool_name (str, optional):

      For Presto only. Pool name to be used, if not specified, default pool would be used.

    • result (str, optional):

      Location where to store the result of the query. e.g. ‘tableau://user:password@host.com:1234/datasource’

tdclient.schedule_api.history_to_tuple(m: dict[str, Any]) tuple[datetime | None, Any, str, Any, Any, datetime | None, datetime | None, Any, Any, Any][source]
tdclient.schedule_api.job_to_tuple(m: dict[str, Any]) tuple[str | None, str, datetime | None][source]
tdclient.schedule_api.schedule_to_tuple(m: dict[str, Any]) dict[str, Any][source]

tdclient.server_status_api

class tdclient.server_status_api.ServerStatusAPI[source]

Bases: object

Access to Server Status API

This class is inherited by tdclient.api.API.

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
server_status() str[source]

Show the status of Treasure Data

Returns:

status

Return type:

str

tdclient.table_api

class tdclient.table_api.TableAPI[source]

Bases: object

Access to Table API

This class is inherited by tdclient.api.API.

change_database(db: str, table: str, dest_db: str) bool[source]

Move a target table from it’s original database to new destination database.

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

  • dest_db (str) – Destination database name.

Returns:

True if succeeded

Return type:

bool

checked_json(body: bytes, required: list[str]) dict[str, Any][source]
create_log_table(db: str, table: str) bool[source]

Create a new table in the database and registers it in PlazmaDB.

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

Returns:

True if succeeded.

Return type:

bool

delete_table(db: str, table: str) str[source]

Delete the specified table.

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

Returns:

Type information of the table (e.g. “log”).

Return type:

str

get(path: str, params: dict[str, Any] | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
list_tables(db: str) dict[str, Any][source]

Gets the list of table in the database.

Parameters:

db (str) – Target database name.

Returns:

Detailed table information.

Return type:

dict

Examples

>>> td.api.list_tables("my_db")
{ 'iris': {'id': 21039862,
  'name': 'iris',
  'estimated_storage_size': 1236,
  'counter_updated_at': '2019-09-18T07:14:28Z',
  'last_log_timestamp': datetime.datetime(2019, 1, 30, 5, 34, 42, tzinfo=tzutc()),
  'delete_protected': False,
  'created_at': datetime.datetime(2019, 1, 30, 5, 34, 42, tzinfo=tzutc()),
  'updated_at': datetime.datetime(2019, 1, 30, 5, 34, 46, tzinfo=tzutc()),
  'type': 'log',
  'count': 150,
  'schema': [['sepal_length', 'double', 'sepal_length'],
   ['sepal_width', 'double', 'sepal_width'],
   ['petal_length', 'double', 'petal_length'],
   ['petal_width', 'double', 'petal_width'],
   ['species', 'string', 'species']],
  'expire_days': None,
  'last_import': datetime.datetime(2019, 9, 18, 7, 14, 28, tzinfo=tzutc())},
}
post(path: str, params: dict[str, Any] | bytes | None = None, headers: dict[str, str] | None = None, **kwargs: Any) AbstractContextManager[BaseHTTPResponse][source]
raise_error(msg: str, res: BaseHTTPResponse, body: bytes | str) None[source]
swap_table(db: str, table1: str, table2: str) bool[source]

Swap the two specified tables with each other belonging to the same database and basically exchanges their names.

Parameters:
  • db (str) – Target database name

  • table1 (str) – First target table for the swap.

  • table2 (str) – Second target table for the swap.

Returns:

True if succeeded.

Return type:

bool

tail(db: str, table: str, count: int, to: Any = None, _from: Any = None, block: Any = None) list[dict[str, Any]][source]

Get the contents of the table in reverse order based on the registered time (last data first).

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

  • count (int) – Number for record to show up from the end.

  • to – Deprecated parameter.

  • _from – Deprecated parameter.

  • block – Deprecated parameter.

Returns:

Contents of the table.

Return type:

[dict]

update_expire(db: str, table: str, expire_days: int) bool[source]

Update the expire days for the specified table

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

  • expire_days (int) – Number of days where the contents of the specified table would expire.

Returns:

True if succeeded.

Return type:

bool

update_schema(db: str, table: str, schema_json: str) bool[source]

Update the table schema.

Parameters:
  • db (str) – Target database name.

  • table (str) – Target table name.

  • schema_json (str) – Schema format JSON string. See also: ~`Client.update_schema` e.g. ‘[[“sep_len”, “long”, “sep_len”], [“sep_wid”, “long”, “sep_wid”]]’

Returns:

True if succeeded.

Return type:

bool