Client¶
The tdclient.client.Client class is the public interface for tdclient.
It provides methods for executing requests against the REST API.
tdclient.client¶
-
class
tdclient.client.
Client
(*args, **kwargs)[source]¶ Bases:
object
API Client for Treasure Data Service
-
add_user
(name, org, email, password)[source]¶ Add a new user
- Parameters
name (str) – name of the user
org (str) – organization
email (str) – e-mail address
password (str) – password
- Returns
True if success
-
bulk_import
(name)[source]¶ Get a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
-
bulk_import_delete_part
(name, part_name)[source]¶ Delete a part from a bulk import session
- Parameters
name (str) – name of a bulk import session
part_name (str) – name of a part of the bulk import session
- Returns
True if success
-
bulk_import_error_records
(name)[source]¶ - Parameters
name (str) – name of a bulk import session
- Returns
an iterator of error records
-
bulk_import_upload_file
(name, part_name, format, file, **kwargs)[source]¶ Upload a part to Bulk Import session, from an existing file on filesystem.
- Parameters
name (str) – name of a bulk import session
part_name (str) – name of a part of the bulk import session
format (str) – format of data type (e.g. “msgpack”, “json”, “csv”, “tsv”)
file (str or file-like) – the name of a file, or a file-like object, containing the data
**kwargs – extra arguments.
There is more documentation on format, file and **kwargs at file import parameters.
In particular, for “csv” and “tsv” data, you can change how data columns are parsed using the dtypes and converters arguments.
dtypes is a dictionary used to specify a datatype for individual columns, for instance {"col1": "int"}. The available datatypes are "bool", "float", "int", "str" and "guess". If a column is also mentioned in converters, then the function will be used, NOT the datatype.
converters is a dictionary used to specify a function that will be used to parse individual columns, for instance {"col1": int}.
The default behaviour is "guess", which makes a best-effort to decide the column datatype. See file import parameters for more details.
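As a rough sketch of the dtypes and converters arguments described above, the helper below uploads a CSV part with two columns given explicit datatypes and a third parsed by a custom function. The column names, the parse_flag helper, and the session, part and file names in the usage comment are hypothetical; client is assumed to be a tdclient.client.Client instance.

```python
def parse_flag(value):
    """Hypothetical converter: map "yes"/"true"/"1" style text to a bool."""
    return value.strip().lower() in ("yes", "true", "1")


# Hypothetical column names, for illustration only.
DTYPES = {"user_id": "int", "score": "float"}
CONVERTERS = {"active": parse_flag}


def upload_csv_part(client, session_name, part_name, path):
    """Upload a CSV file as one part of an existing bulk import session."""
    # Columns named in converters are parsed by the given function,
    # columns named in dtypes use the named datatype, and every other
    # column is left to the default "guess" behaviour.
    return client.bulk_import_upload_file(
        session_name,
        part_name,
        "csv",
        path,
        dtypes=DTYPES,
        converters=CONVERTERS,
    )


# Usage, assuming an authenticated client and an existing session:
# upload_csv_part(client, "session_name", "part1", "data.csv")
```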
-
bulk_import_upload_part
(name, part_name, bytes_or_stream, size)[source]¶ Upload a part to a bulk import session
- Parameters
name (str) – name of a bulk import session
part_name (str) – name of a part of the bulk import session
bytes_or_stream (file-like) – a file-like object containing the part
size (int) – the size of the part
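A minimal sketch of how bytes_or_stream and size relate, assuming client is a tdclient.client.Client and that payload already holds the part in whatever serialized form the service expects. The helper name and its arguments are illustrative only.

```python
import io


def upload_bytes_part(client, session_name, part_name, payload):
    """Upload an in-memory bytes payload as one part of a bulk import session."""
    stream = io.BytesIO(payload)  # wrap the bytes in a file-like object
    # size is passed as the number of bytes in the part
    client.bulk_import_upload_part(session_name, part_name, stream, len(payload))
    return len(payload)
```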
-
bulk_imports
()[source]¶ List bulk import sessions
- Returns
a list of tdclient.models.BulkImport
-
change_database
(db_name, table_name, new_db_name)[source]¶ Move a target table from its original database to a new destination database.
- Parameters
db_name (str) – Target database name.
table_name (str) – Target table name.
new_db_name (str) – Name of the destination database to move the table to.
- Returns
True if succeeded.
- Return type
bool
-
commit_bulk_import
(name)[source]¶ Commit a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
True if success
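The bulk import methods on this page are typically used together. The sketch below strings them into one flow; the order of calls (create, upload, freeze, perform, commit) and the wait() method on the object returned by perform_bulk_import are assumptions, not something this page states, and the helper name and arguments are illustrative.

```python
def run_bulk_import(client, session_name, database, table, part_name, path, fmt="csv"):
    """Create a session, upload one file, then freeze, perform and commit."""
    client.create_bulk_import(session_name, database, table)
    client.bulk_import_upload_file(session_name, part_name, fmt, path)
    client.freeze_bulk_import(session_name)

    job = client.perform_bulk_import(session_name)
    # Assumption: the returned object offers a blocking wait() method.
    job.wait()

    # Collect any records that failed to import before committing.
    errors = list(client.bulk_import_error_records(session_name))
    client.commit_bulk_import(session_name)
    return errors


# Usage, assuming an authenticated client:
# errors = run_bulk_import(client, "session_name", "mydb", "mytbl", "part1", "data.csv")
```

Returning the error records rather than raising leaves the decision about partial failures to the caller.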
-
create_bulk_import
(name, database, table, params=None)[source]¶ Create new bulk import session
- Parameters
name (str) – name of new bulk import session
database (str) – name of a database
table (str) – name of a table
- Returns
-
create_database
(db_name, **kwargs)[source]¶ - Parameters
db_name (str) – name of a database to create
- Returns
True if success
-
create_log_table
(db_name, table_name)[source]¶ - Parameters
db_name (str) – name of a database
table_name (str) – name of a table to create
- Returns
True if success
-
create_result
(name, url, params=None)[source]¶ Create a new authentication with the specified name.
- Parameters
name (str) – Authentication name.
url (str) – Url of the authentication to be created. e.g. “ftp://test.com/”
params (dict, optional) – Extra parameters.
- Returns
True if succeeded.
- Return type
bool
-
create_schedule
(name, params=None)[source]¶ Create a new scheduled query with the specified name.
- Parameters
name (str) – Scheduled query name.
params (dict, optional) –
Extra parameters.
- type (str):
Query type. {“presto”, “hive”}. Default: “hive”
- database (str):
Target database name.
- timezone (str):
Scheduled query’s timezone. e.g. “UTC” For details, see also: https://gist.github.com/frsyuki/4533752
- cron (str, optional):
Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://support.treasuredata.com/hc/en-us/articles/360001451088-Scheduled-Jobs-Web-Console
- delay (int, optional):
A delay ensures all buffered events are imported before running the query. Default: 0
- query (str):
Query to run, written in SQL (used to retrieve, insert, update and modify data). See also: https://support.treasuredata.com/hc/en-us/articles/360012069493-SQL-Examples-of-Scheduled-Queries
- priority (int, optional):
Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0
- retry_limit (int, optional):
Automatic retry count. Default: 0
- engine_version (str, optional):
Engine version to be used. If none is specified, the account’s default engine version would be set. {“stable”, “experimental”}
- pool_name (str, optional):
For Presto only. Pool name to be used, if not specified, default pool would be used.
- result (str, optional):
Location where to store the result of the query. e.g. ‘tableau://user:password@host.com:1234/datasource’
- Returns
Start date time.
- Return type
datetime.datetime
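To make the params keys above concrete, here is a small sketch that builds the dictionary for a daily Presto schedule and passes it to create_schedule. The helper names and the schedule name, database and query in the usage comment are hypothetical; client is assumed to be a tdclient.client.Client instance.

```python
def daily_schedule_params(database, query, timezone="UTC"):
    """Build the params dict for a daily Presto scheduled query."""
    return {
        "type": "presto",
        "database": database,
        "timezone": timezone,
        "cron": "@daily",
        "delay": 0,
        "query": query,
        "priority": 0,  # documented range: -2 (very low) to 2 (very high)
        "retry_limit": 0,
    }


def create_daily_schedule(client, name, database, query):
    """Register the schedule and return whatever create_schedule returns."""
    return client.create_schedule(name, daily_schedule_params(database, query))


# Usage, assuming an authenticated client:
# start = create_daily_schedule(client, "daily_count", "mydb", "SELECT COUNT(1) FROM mytbl")
```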
-
databases
()[source]¶ - Returns
a list of tdclient.models.Database
-
delete_bulk_import
(name)[source]¶ Delete a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
True if success
-
delete_database
(db_name)[source]¶ - Parameters
db_name (str) – name of database to delete
- Returns
True if success
-
delete_result
(name)[source]¶ Delete the authentication having the specified name.
- Parameters
name (str) – Authentication name.
- Returns
True if succeeded.
- Return type
bool
-
delete_schedule
(name)[source]¶ Delete the scheduled query with the specified name.
- Parameters
name (str) – Target scheduled query name.
- Returns
Tuple of cron and query.
- Return type
(str, str)
-
delete_table
(db_name, table_name)[source]¶ Delete a table
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
- Returns
a string representing the type of the deleted table
-
export_data
(db_name, table_name, storage_type, params=None)[source]¶ Export data from Treasure Data Service
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
storage_type (str) – type of the storage
params (dict) –
optional parameters. Assuming the following keys:
- access_key_id (str):
ID to access the information to be exported.
- secret_access_key (str):
Password for the access_key_id.
- file_prefix (str, optional):
Filename of exported file. Default: “<database_name>/<table_name>”
- file_format (str, optional):
File format of the information to be exported. {“jsonl.gz”, “tsv.gz”, “json.gz”}
- from (int, optional):
From Time of the data to be exported in Unix epoch format.
- to (int, optional):
End Time of the data to be exported in Unix epoch format.
- assume_role (str, optional):
Assume role.
- bucket (str):
Name of bucket to be used.
- domain_key (str, optional):
Job domain key.
- pool_name (str, optional):
For Presto only. Pool name to be used, if not specified, default pool would be used.
- Returns
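The params keys for export_data can be sketched as follows, with the from and to values derived from timezone-aware datetimes. The helper names and the example time window are illustrative, and passing "s3" as storage_type is an assumption, since the accepted values are not listed on this page.

```python
from datetime import datetime, timezone


def build_export_params(bucket, access_key_id, secret_access_key, start, end):
    """Build the params dict for export_data from timezone-aware datetimes."""
    return {
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "bucket": bucket,
        "file_format": "jsonl.gz",
        "from": int(start.timestamp()),  # Unix epoch seconds
        "to": int(end.timestamp()),
    }


def export_table(client, db_name, table_name, params):
    """Start an export; "s3" as the storage type is an assumption."""
    return client.export_data(db_name, table_name, "s3", params)


# Example window: the whole of 1 January 2020 (UTC).
START = datetime(2020, 1, 1, tzinfo=timezone.utc)
END = datetime(2020, 1, 2, tzinfo=timezone.utc)
```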
-
freeze_bulk_import
(name)[source]¶ Freeze a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
True if success
-
history
(name, _from=None, to=None)[source]¶ Get the history details of the saved query for the past 90 days.
- Parameters
name (str) – Target name of the scheduled query.
_from (int, optional) – Indicates from which nth record in the run history would be fetched. Default: 0. Note: Count starts from zero. This means that the first record in the list has a count of zero.
to (int, optional) – Indicates up to which nth record in the run history would be fetched. Default: 20
- Returns
-
import_data
(db_name, table_name, format, bytes_or_stream, size, unique_id=None)[source]¶ Import data into Treasure Data Service
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
format (str) – format of data type (e.g. “msgpack.gz”)
bytes_or_stream (str or file-like) – a byte string or a file-like object containing the data
size (int) – the length of the data
unique_id (str) – a unique identifier of the data
- Returns
a float representing the elapsed time, in seconds, to import the data
-
import_file
(db_name, table_name, format, file, unique_id=None)[source]¶ Import data into Treasure Data Service, from an existing file on filesystem.
This method decompresses/deserializes the records in the given file and then converts them into the format accepted by Treasure Data Service (“msgpack.gz”).
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
format (str) – format of data type (e.g. “msgpack”, “json”)
file (str or file-like) – the name of a file, or a file-like object, containing the data
unique_id (str) – a unique identifier of the data
- Returns
a float representing the elapsed time to import the data
-
job_result
(job_id)[source]¶ - Parameters
job_id (str) – job id
- Returns
a list of the rows in the result set
-
job_result_format
(job_id, format)[source]¶ - Parameters
job_id (str) – job id
format (str) – output format of result set
- Returns
a list of the rows in the result set
-
job_result_format_each
(job_id, format)[source]¶ - Parameters
job_id (str) – job id
format (str) – output format of result set
- Returns
an iterator of rows in result set
-
job_status
(job_id)[source]¶ - Parameters
job_id (str) – job id
- Returns
a string representing the status of the job (“success”, “error”, “killed”, “queued”, “running”)
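One way to use the status strings listed above is a simple polling loop followed by job_result. This is only a sketch: the helper names, the poll interval and the retry cap are arbitrary choices, and client is assumed to be a tdclient.client.Client instance.

```python
import time

# Terminal statuses, taken from the job_status description above.
FINISHED = {"success", "error", "killed"}


def wait_for_job(client, job_id, poll_interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll job_status until the job is no longer queued or running."""
    for _ in range(max_polls):
        status = client.job_status(job_id)
        if status in FINISHED:
            return status
        sleep(poll_interval)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")


def fetch_rows(client, job_id, **wait_kwargs):
    """Return the result rows if the job succeeded, otherwise raise."""
    status = wait_for_job(client, job_id, **wait_kwargs)
    if status != "success":
        raise RuntimeError(f"job {job_id} ended with status {status!r}")
    return client.job_result(job_id)
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays.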
-
jobs
(_from=None, to=None, status=None, conditions=None)[source]¶ List jobs
- Parameters
_from (int, optional) – Gets the Job from the nth index in the list. Default: 0.
to (int, optional) – Gets the Job up to the nth index in the list. By default, the first 20 jobs in the list are displayed
status (str, optional) – Filter by given status. {“queued”, “running”, “success”, “error”}
conditions (str, optional) – Condition for TIMESTAMPDIFF() to search for slow queries. Avoid using this parameter as it can be dangerous.
- Returns
a list of tdclient.models.Job
-
kill
(job_id)[source]¶ - Parameters
job_id (str) – job id
- Returns
a string representing the status of the killed job (“queued”, “running”)
-
list_apikeys
(name)[source]¶ - Parameters
name (str) – name of the user
- Returns
a list of API key strings
-
list_bulk_import_parts
(name)[source]¶ List parts of a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
a list of strings representing the names of the parts
-
partial_delete
(db_name, table_name, to, _from, params=None)[source]¶ Create a job to partially delete the contents of the table with the given time range.
- Parameters
db_name (str) – Target database name.
table_name (str) – Target table name.
to (int) – Time in Unix Epoch format indicating the End date and time of the data to be deleted. Should be set only by the hour. Minutes and seconds values will not be accepted.
_from (int) – Time in Unix Epoch format indicating the Start date and time of the data to be deleted. Should be set only by the hour. Minutes and seconds values will not be accepted.
params (dict, optional) –
Extra parameters.
- pool_name (str, optional):
Indicates the resource pool to execute this job. If not provided, the account’s default resource pool would be used.
- domain_key (str, optional):
Domain key that will be assigned to the partial delete job to be created
- Returns
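Because to and _from must fall exactly on the hour, it can help to round timestamps down before calling partial_delete. The sketch below does that; the helper names are hypothetical, the datetimes are assumed to be timezone-aware, and client is assumed to be a tdclient.client.Client instance.

```python
from datetime import datetime, timezone


def floor_to_hour(dt):
    """Return the Unix epoch (int) of dt rounded down to the start of its hour."""
    ts = int(dt.timestamp())
    return ts - ts % 3600


def delete_range(client, db_name, table_name, start, end):
    """Create a partial delete job covering [start, end), aligned to the hour."""
    _from = floor_to_hour(start)
    to = floor_to_hour(end)
    if _from >= to:
        raise ValueError("time range is empty after rounding to the hour")
    # Note the argument order in the signature: `to` comes before `_from`.
    return client.partial_delete(db_name, table_name, to, _from)


# Usage, assuming an authenticated client:
# delete_range(client, "mydb", "mytbl",
#              datetime(2020, 1, 1, tzinfo=timezone.utc),
#              datetime(2020, 1, 2, tzinfo=timezone.utc))
```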
-
perform_bulk_import
(name)[source]¶ Perform a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
-
query
(db_name, q, result_url=None, priority=None, retry_limit=None, type='hive', **kwargs)[source]¶ Run a query on specified database table.
- Parameters
db_name (str) – name of a database
q (str) – a query string
result_url (str) – result output URL. e.g., postgresql://<username>:<password>@<hostname>:<port>/<database>/<table>
priority (int or str) – priority (e.g. “NORMAL”, “HIGH”, etc.)
retry_limit (int) – retry limit
type (str) – name of a query engine
- Returns
- Raises
ValueError – if unknown query type has been specified
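A short sketch of calling query with the Presto engine and collecting the rows. It assumes the returned object offers wait() and result() methods, which this page does not state; the helper name and the database and SQL in the usage comment are hypothetical.

```python
def run_presto_query(client, db_name, sql):
    """Run a Presto query and return all rows as a list."""
    job = client.query(db_name, sql, type="presto", retry_limit=1)
    # Assumption: the returned object has a blocking wait() and a
    # result() method that yields rows.
    job.wait()
    return list(job.result())


# Usage, assuming an authenticated client:
# rows = run_presto_query(client, "sample_datasets", "SELECT COUNT(1) FROM www_access")
```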
-
remove_apikey
(name, apikey)[source]¶ - Parameters
name (str) – name of the user
apikey (str) – an API key to remove
- Returns
True if success
-
remove_user
(name)[source]¶ Remove a user
- Parameters
name (str) – name of the user
- Returns
True if success
-
results
()[source]¶ Get the list of all the available authentications.
- Returns
a list of tdclient.models.Result
-
run_schedule
(name, time, num)[source]¶ Execute the specified query.
- Parameters
name (str) – Target scheduled query name.
time (int) – Time in Unix epoch format that would be set as TD_SCHEDULED_TIME
num (int) – Indicates how many times the query will be executed. Value should be 9 or less.
- Returns
-
swap_table
(db_name, table_name1, table_name2)[source]¶ - Parameters
db_name (str) – name of a database
table_name1 (str) – original table name
table_name2 (str) – table name you want to rename to
- Returns
True if success
-
table
(db_name, table_name)[source]¶ - Parameters
db_name (str) – name of a database
table_name (str) – name of a table
- Returns
- Raises
tdclient.api.NotFoundError – if the table doesn’t exist
-
tables
(db_name)[source]¶ List existing tables
- Parameters
db_name (str) – name of a database
- Returns
a list of tdclient.models.Table
-
tail
(db_name, table_name, count, to=None, _from=None, block=None)[source]¶ Get the contents of the table in reverse order based on the registered time (last data first).
- Parameters
db_name (str) – Target database name.
table_name (str) – Target table name.
count (int) – Number of records to show from the end.
to – Deprecated parameter.
_from – Deprecated parameter.
block – Deprecated parameter.
- Returns
Contents of the table.
- Return type
[dict]
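Since tail returns a list of dicts, a single field can be picked out of the most recent records as in this sketch. The helper name and the field name in the usage comment are hypothetical; client is assumed to be a tdclient.client.Client instance.

```python
def latest_values(client, db_name, table_name, field, count=5):
    """Return `field` from the `count` most recent records, newest first."""
    rows = client.tail(db_name, table_name, count)
    # Records that lack the field contribute None.
    return [row.get(field) for row in rows]


# Usage, assuming an authenticated client:
# latest_values(client, "mydb", "mytbl", "path")
```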
-
unfreeze_bulk_import
(name)[source]¶ Unfreeze a bulk import session
- Parameters
name (str) – name of a bulk import session
- Returns
True if success
-
update_expire
(db_name, table_name, expire_days)[source]¶ Set expiration date to a table
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
expire_days (int) – expiration date in days from today
- Returns
True if success
-
update_schedule
(name, params=None)[source]¶ Update the scheduled query.
- Parameters
name (str) – Target scheduled query name.
params (dict) –
Extra parameters.
- type (str):
Query type. {“presto”, “hive”}. Default: “hive”
- database (str):
Target database name.
- timezone (str):
Scheduled query’s timezone. e.g. “UTC” For details, see also: https://gist.github.com/frsyuki/4533752
- cron (str, optional):
Schedule of the query. {"@daily", "@hourly", "10 * * * *" (custom cron)} See also: https://support.treasuredata.com/hc/en-us/articles/360001451088-Scheduled-Jobs-Web-Console
- delay (int, optional):
A delay ensures all buffered events are imported before running the query. Default: 0
- query (str):
Query to run, written in SQL (used to retrieve, insert, update and modify data). See also: https://support.treasuredata.com/hc/en-us/articles/360012069493-SQL-Examples-of-Scheduled-Queries
- priority (int, optional):
Priority of the query. Range is from -2 (very low) to 2 (very high). Default: 0
- retry_limit (int, optional):
Automatic retry count. Default: 0
- engine_version (str, optional):
Engine version to be used. If none is specified, the account’s default engine version would be set. {“stable”, “experimental”}
- pool_name (str, optional):
For Presto only. Pool name to be used, if not specified, default pool would be used.
- result (str, optional):
Location where to store the result of the query. e.g. ‘tableau://user:password@host.com:1234/datasource’
-
update_schema
(db_name, table_name, schema)[source]¶ Updates the schema of a table
- Parameters
db_name (str) – name of a database
table_name (str) – name of a table
schema (list) –
a list representing the schema definition (will be converted to JSON), where each entry is [column name, data type, alias of the column name], e.g.
[["member_id", "string", "mem_id"], ["row_index", "long", "row_ind"], ...]
- Returns
True if success
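The schema argument can be written out as a plain Python list, as in the sketch below, which reuses the column entries from the example above. The check_schema and apply_schema helpers are hypothetical additions for catching malformed entries early; client is assumed to be a tdclient.client.Client instance.

```python
# Each entry is [column name, data type, alias of the column name].
SCHEMA = [
    ["member_id", "string", "mem_id"],
    ["row_index", "long", "row_ind"],
]


def check_schema(schema):
    """Raise ValueError unless every entry is a sequence of three strings."""
    for entry in schema:
        if len(entry) != 3 or not all(isinstance(x, str) for x in entry):
            raise ValueError(f"bad schema entry: {entry!r}")
    return schema


def apply_schema(client, db_name, table_name, schema=SCHEMA):
    """Validate the schema locally, then send it with update_schema."""
    return client.update_schema(db_name, table_name, check_schema(schema))


# Usage, assuming an authenticated client:
# apply_schema(client, "mydb", "mytbl")
```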
-
users
()[source]¶ List users
- Returns
a list of tdclient.models.User
-
property
api
¶ an instance of tdclient.api.API
-
property
apikey
¶ API key string.
-