Misc
tdclient.errors
exception tdclient.errors.AlreadyExistsError
Bases: tdclient.errors.APIError
exception tdclient.errors.AuthError
Bases: tdclient.errors.APIError
exception tdclient.errors.DatabaseError
Bases: tdclient.errors.Error
exception tdclient.errors.ForbiddenError
Bases: tdclient.errors.APIError
exception tdclient.errors.InterfaceError
Bases: tdclient.errors.Error
exception tdclient.errors.NotFoundError
Bases: tdclient.errors.APIError
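The Bases relationships listed above determine which except clauses match which errors. The following is a minimal sketch using stand-in classes that mirror only those relationships; the stand-ins, the assumption that APIError and Error derive directly from Exception, and the fetch_table helper are hypothetical and are not taken from tdclient itself.

```python
# Stand-in classes mirroring the documented "Bases" relationships.
# Assumption: APIError and Error derive directly from Exception.
class APIError(Exception):
    pass


class Error(Exception):
    pass


class NotFoundError(APIError):
    pass


class AuthError(APIError):
    pass


class DatabaseError(Error):
    pass


def fetch_table(name):
    """Hypothetical helper that fails the way an API call might."""
    raise NotFoundError(f"table not found: {name}")


def describe_failure(name):
    """Handle the most specific error first, then the shared base class."""
    try:
        fetch_table(name)
    except NotFoundError as exc:
        return f"missing: {exc}"
    except APIError as exc:
        return f"api error: {exc}"
    return "ok"


print(describe_failure("www_access"))
```

Because DatabaseError and InterfaceError are listed with a different base (Error), an `except APIError` clause would not catch them in this sketch.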
tdclient.util
tdclient.util.create_msgpack(items)
Create msgpack streaming bytes from a list.
- Parameters
items (list of dict) – target list
- Returns
Converted msgpack streaming (bytes)
Examples
>>> t1 = int(time.time())
>>> l1 = [{"a": 1, "b": 2, "time": t1}, {"a":3, "b": 6, "time": t1}]
>>> create_msgpack(l1)
b'\x83\xa1a\x01\xa1b\x02\xa4time\xce]\xa5X\xa1\x83\xa1a\x03\xa1b\x06\xa4time\xce]\xa5X\xa1'
tdclient.util.create_url(tmpl, **values)
Create url with values
- Parameters
tmpl (str) – url template
values (dict) – values for url
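As a rough illustration of filling a URL template with values, here is a minimal sketch. It assumes the template uses str.format-style placeholders and that each value is percent-encoded before substitution; both points, along with the name create_url_sketch and the example path, are assumptions rather than documented behaviour.

```python
from urllib.parse import quote


def create_url_sketch(tmpl, **values):
    """Fill a str.format-style template with percent-encoded values.

    Assumption: values are quoted with urllib.parse.quote before substitution.
    """
    quoted = {key: quote(str(value)) for key, value in values.items()}
    return tmpl.format(**quoted)


# Hypothetical template, for illustration only.
print(create_url_sketch("/v3/table/show/{database}/{table}",
                        database="my db", table="events"))
```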
tdclient.util.csv_dict_record_reader(file_like, encoding, dialect)
Yield records from a CSV input using csv.DictReader.
This is a reader suitable for use by tdclient.util.read_csv_records.
It is used to read CSV data when the column names are read from the first row in the CSV data.
- Parameters
file_like – acts like an instance of io.BufferedIOBase. Reading from it returns bytes.
encoding (str) – the name of the encoding to use when turning those bytes into strings.
dialect (str) – the name of the CSV dialect to use.
- Yields
For each row of CSV data read from file_like, yields a dictionary whose keys are column names (determined from the first row in the CSV data) and whose values are the column values.
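The behaviour described above can be approximated with the standard library alone. This is a sketch under the assumption that the bytes are decoded with io.TextIOWrapper; the name dict_record_reader_sketch and the sample data are illustrative only.

```python
import csv
import io


def dict_record_reader_sketch(file_like, encoding, dialect):
    """Decode a bytes stream and yield one dict per CSV row.

    Column names come from the first row, as csv.DictReader does by default.
    """
    text = io.TextIOWrapper(file_like, encoding, newline="")
    for row in csv.DictReader(text, dialect=dialect):
        yield dict(row)


data = io.BytesIO(b"time,name\n1,alice\n2,bob\n")
records = list(dict_record_reader_sketch(data, "utf-8", "excel"))
print(records)
```

Every value is still a string at this point; type conversion is a separate step.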
tdclient.util.csv_text_record_reader(file_like, encoding, dialect, columns)
Yield records from a CSV input using csv.reader and explicit column names.
This is a reader suitable for use by tdclient.util.read_csv_records.
It is used to read CSV data when the column names are supplied as an explicit columns parameter.
- Parameters
file_like – acts like an instance of io.BufferedIOBase. Reading from it returns bytes.
encoding (str) – the name of the encoding to use when turning those bytes into strings.
dialect (str) – the name of the CSV dialect to use.
- Yields
For each row of CSV data read from file_like, yields a dictionary whose keys are column names (determined by columns) and whose values are the column values.
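A comparable sketch for the explicit-columns case pairs each row from csv.reader with the supplied names. It assumes that every row, including the first, is data; the name text_record_reader_sketch and the sample input are illustrative only.

```python
import csv
import io


def text_record_reader_sketch(file_like, encoding, dialect, columns):
    """Decode a bytes stream and yield one dict per CSV row.

    Keys come from the explicit columns argument, not from the data.
    """
    text = io.TextIOWrapper(file_like, encoding, newline="")
    for row in csv.reader(text, dialect=dialect):
        yield dict(zip(columns, row))


data = io.BytesIO(b"1,alice\n2,bob\n")
records = list(text_record_reader_sketch(data, "utf-8", "excel", ["time", "name"]))
print(records)
```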
tdclient.util.get_or_else(hashmap, key, default_value=None)
Get value or default value
It differs from the standard dict get method in its behaviour when key is present but has a value that is an empty string or a string of only spaces.
- Parameters
hashmap (dict) – target
key (Any) – key
default_value (Any) – default value
Example
>>> get_or_else({'k': 'nonspace'}, 'k', 'default')
'nonspace'
>>> get_or_else({'k': ''}, 'k', 'default')
'default'
>>> get_or_else({'k': ' '}, 'k', 'default')
'default'
- Returns
The value of key or default_value
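To make the contrast with dict.get concrete, here is a minimal sketch of the described behaviour. It assumes the stored values are strings (or None); the name get_or_else_sketch is illustrative.

```python
def get_or_else_sketch(hashmap, key, default_value=None):
    """Return the value for key, or the default if it is missing or blank.

    Assumption: stored values are strings, so .strip() is available.
    """
    value = hashmap.get(key)
    if value is None or not value.strip():
        return default_value
    return value


settings = {"k": "   "}
print(repr(settings.get("k", "default")))            # dict.get keeps the blank string
print(repr(get_or_else_sketch(settings, "k", "default")))  # the sketch falls back
```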
tdclient.util.guess_csv_value(s)
Determine the most appropriate type for s and return it.
Tries to interpret s as a more specific datatype, in the following order, and returns the first that succeeds:
As an integer
As a floating point value
If it is “false” or “true” (case insensitive), then as a boolean
If it is “” or “none” or “null” (case insensitive), then as None
As the string itself, unaltered
- Parameters
s (str) – a string value, assumed to have been read from a CSV file.
- Returns
A good guess at a more specific value (int, float, str, bool or None)
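The order of attempts listed above can be sketched as follows. This is an illustration of the documented order, not the library's source; the name guess_value_sketch is hypothetical.

```python
def guess_value_sketch(s):
    """Try int, then float, then bool, then None, else return s unchanged."""
    try:
        return int(s)
    except ValueError:
        pass
    try:
        return float(s)
    except ValueError:
        pass
    lowered = s.lower()
    if lowered in ("false", "true"):
        return lowered == "true"
    if lowered in ("", "none", "null"):
        return None
    return s


for raw in ["10", "1.5", "TRUE", "null", "hello"]:
    print(repr(raw), "->", repr(guess_value_sketch(raw)))
```

Because the integer attempt comes first, a value such as "10" becomes an int rather than a float.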
tdclient.util.merge_dtypes_and_converters(dtypes=None, converters=None)
Generate a merged dictionary from those given.
- Parameters
dtypes (optional dict) – A dictionary mapping column name to “dtype” (datatype), where “dtype” may be any of the strings ‘bool’, ‘float’, ‘int’, ‘str’ or ‘guess’.
converters (optional dict) – A dictionary mapping column name to a callable. The callable should take a string as its single argument, and return the result of parsing that string.
Internally, the dtypes dictionary is converted to a temporary dictionary of the same form as converters - that is, mapping column names to callables. The “data type” string values in dtypes are converted to the Python builtins of the same name, and the value “guess” is converted to the tdclient.util.guess_csv_value callable.
Example
>>> merge_dtypes_and_converters(
...     dtypes={'col1': 'int', 'col2': 'float'},
...     converters={'col2': int},
... )
{'col1': int, 'col2': int}
- Returns
(dict) A dictionary which maps column names to callables. If a column name occurs in both input dictionaries, the callable specified in converters is used.
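The merge rule described above (dtype names become callables, then converters take precedence) might look like the sketch below. The names merge_sketch and guess_stand_in are hypothetical, and guess_stand_in is only a placeholder for tdclient.util.guess_csv_value.

```python
def guess_stand_in(s):
    """Hypothetical placeholder for tdclient.util.guess_csv_value."""
    return s


def merge_sketch(dtypes=None, converters=None):
    """Map column names to callables; converters win on conflicts."""
    dtype_callables = {
        "bool": bool,
        "float": float,
        "int": int,
        "str": str,
        "guess": guess_stand_in,
    }
    # Assumption: an unknown dtype string simply raises KeyError here.
    merged = {column: dtype_callables[name] for column, name in (dtypes or {}).items()}
    merged.update(converters or {})
    return merged


merged = merge_sketch(
    dtypes={"col1": "int", "col2": "float"},
    converters={"col2": int},
)
print(merged["col1"] is int, merged["col2"] is int)
```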
tdclient.util.normalize_connector_config(config)
Normalize connector config
This is a port of TD CLI’s ConnectorConfigNormalizer#normalized_config. See also: https://github.com/treasure-data/td/blob/15495f12d8645a7b3f6804098f8f8aca72de90b9/lib/td/connector_config_normalizer.rb#L7-L30
- Parameters
config (dict) – A config to be normalized
- Returns
Normalized configuration
- Return type
dict
Examples
Only with in key in a config.
>>> config = {"in": {"type": "s3"}}
>>> normalize_connector_config(config)
{'in': {'type': 's3'}, 'out': {}, 'exec': {}, 'filters': []}
With in, out, exec, and filters in a config.
>>> config = {
...     "in": {"type": "s3"},
...     "out": {"mode": "append"},
...     "exec": {"guess_plugins": ["json"]},
...     "filters": [{"type": "speedometer"}],
... }
>>> normalize_connector_config(config)
{'in': {'type': 's3'}, 'out': {'mode': 'append'}, 'exec': {'guess_plugins': ['json']}, 'filters': [{'type': 'speedometer'}]}
tdclient.util.normalized_msgpack(value)
Recursively convert int to str if the int “overflows”.
- Parameters
value (list, dict, int, float, str, bool or None) – value to be normalized
If value is a list, then all elements in the list are (recursively) normalized.
If value is a dictionary, then all the dictionary keys and values are (recursively) normalized.
If value is an integer, and outside the range -(1 << 63) to (1 << 64), then it is converted to a string.
Otherwise, value is returned unchanged.
- Returns
Normalized value
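The recursive rules above can be sketched as follows. The text does not say how the boundaries are treated, so this sketch assumes that -(1 << 63) is kept as an integer and that (1 << 64) itself is converted to a string; the name normalized_sketch is illustrative.

```python
def normalized_sketch(value):
    """Recursively turn out-of-range integers into strings."""
    if isinstance(value, list):
        return [normalized_sketch(item) for item in value]
    if isinstance(value, dict):
        return {normalized_sketch(k): normalized_sketch(v) for k, v in value.items()}
    # Assumption: lower bound inclusive, upper bound exclusive.
    if isinstance(value, int) and (value < -(1 << 63) or value >= (1 << 64)):
        return str(value)
    return value


record = {"small": 1, "big": 1 << 64, "nested": [-(1 << 63) - 1, 2.5, "text"]}
print(normalized_sketch(record))
```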
tdclient.util.parse_csv_value(k, s, converters=None)
Given a CSV (string) value, work out an actual value.
- Parameters
k (str) – The name of the column that the value belongs to.
s (str) – The value as read from the CSV input.
converters (optional dict) – A dictionary mapping column name to callable.
If converters is given, and there is a key matching k in converters, then converters[k](s) will be called to work out the return value. Otherwise, tdclient.util.guess_csv_value will be called with s as its argument.
Warning
No attempt is made to cope with any errors occurring in a callable from the converters dictionary. So if int is called on the string "not-an-int", the resulting ValueError is not caught.
Example
>>> parse_csv_value('col1', 'A string')
'A string'
>>> parse_csv_value('col1', '10')
10
>>> parse_csv_value('col1', '10', {'col1': float, 'col2': int})
10.0
- Returns
The value for the CSV column, after parsing by a callable from converters, or after parsing by tdclient.util.guess_csv_value.
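The lookup-then-fallback logic, and the fact that converter errors propagate to the caller, can be sketched like this. The names parse_value_sketch and guess_stand_in are hypothetical; guess_stand_in is a deliberately simplified placeholder for tdclient.util.guess_csv_value that only tries int.

```python
def guess_stand_in(s):
    """Simplified, hypothetical stand-in for tdclient.util.guess_csv_value."""
    try:
        return int(s)
    except ValueError:
        return s


def parse_value_sketch(k, s, converters=None):
    """Use the column's converter if one exists, else fall back to guessing."""
    if converters and k in converters:
        # Errors raised by the converter are deliberately not caught.
        return converters[k](s)
    return guess_stand_in(s)


try:
    parse_value_sketch("col1", "not-an-int", {"col1": int})
except ValueError as exc:
    print("converter error reached the caller:", exc)
```

A caller that wants to tolerate bad cells would need to wrap the call, or supply a converter that handles its own errors.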
tdclient.util.parse_date(s)
Parse date from str to datetime
TODO: parse datetime using an optional format string
For now, this does not use a format string, since the API may return dates in an ambiguous format :(
- Parameters
s (str) – target str
- Returns
datetime
tdclient.util.read_csv_records(csv_reader, dtypes=None, converters=None, **kwargs)
Read records using csv_reader and yield the results.
tdclient.util.validate_record(record)
Check that record contains a key called “time”.
- Parameters
record (dict) – a dictionary representing a data record, where the keys name the “columns”.
- Returns
True if there is a key called “time” (it actually checks for both "time" (a string) and b"time" (a binary)). False if there is no key called “time”.
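The check described above amounts to a membership test on two key forms. A minimal sketch, with the illustrative name validate_record_sketch:

```python
def validate_record_sketch(record):
    """Return True if record has a "time" key, as either str or bytes."""
    return any(key in record for key in ("time", b"time"))


print(validate_record_sketch({"time": 1570000000, "a": 1}))
print(validate_record_sketch({b"time": 1570000000}))
print(validate_record_sketch({"a": 1}))
```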