Transform logs in specific text formats
The practice cases in this topic are based on the actual data transformation requirements submitted in tickets during daily work.This topic describes how to use LOG domain-specific language (DSL) orchestration to transform logs to meet the requirements.
Scenario: Convert non-standard JSON objects to JSON data and expand the objects
Assume that you want to perform secondary nesting on the collected dictionary data and expand the data.You can convert the dictionary data to JSON data and then use the e_json
function to expand the data.
Raw log entries
content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } }
LOG DSL orchestration
Convert data in the
content
field to the JSON format.pythone_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
The log after processing is as follows:
content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } } content_json: { "referer": "-", "request": "GET /phpMyAdmin", "status": 404, "data-1": { "aaa": "Mozilla", "bbb": "asde" }, "data-2": { "up_adde": "-", "up_host": "-" } }
Expand the standardized
content_json
data generated after the preceding processing.Expand the standardizedcontent_json
data generated after the preceding processing.pythone_json("content_json",depth=1,fmt='full')
The log after expanding is as follows:
content_json.data-1.data-1: {"aaa": "Mozilla", "bbb": "asde"} content_json.data-2.data-2: {"up_adde": "-", "up_host": "-"} content_json.referer: - content_json.request: GET /phpMyAdmin content_json.status: 404
setting the depth
parameter to 2, the following log is generated:
content_json.data-1.aaa: Mozilla
content_json.data-1.bbb: asde
content_json.data-2.up_adde: -
content_json.data-2.up_host: -
content_json.referer: -
content_json.request: GET /phpMyAdmin
content_json.status: 404
- To sum up, use the following LOG DSL rules:python
e_set( "content_json", str_replace(ct_str(v("content")),"'",'"') ) e_json("content_json",depth=2,fmt='full')
Log after transformation. After the log is transformed by setting the
depth
parameter to 2, the following log is generated:content: { 'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, 'data-1': { 'aaa': 'Mozilla', 'bbb': 'asde' }, 'data-2': { 'up_adde': '-', 'up_host': '-' } } content_json: { "referer": "-", "request": "GET /phpMyAdmin", "status": 404, "data-1": { "aaa": "Mozilla", "bbb": "asde" }, "data-2": { "up_adde": "-", "up_host": "-" } } content_json.data-1.aaa: Mozilla content_json.data-1.bbb: asde content_json.data-2.up_adde: - content_json.data-2.up_host: - content_json.referer: - content_json.request: GET /phpMyAdmin content_json.status: 404
Convert log data in other text formats to data in the JSON format and expand the data
To expand non-standard JSON data, you can flexibly combine rules.
Raw log entries
content : { "pod" => { "name" => "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node" => { "name" => "tw5" }, "labels" => { "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow" }, "container" => { "name" => "crm-learning-follow" }, "namespace" => "testing1" }
LOG DSL orchestration
Convert the log data to data in the JSON format by using the
str_logtash_config_normalize
function.pythone_set( "normalize_data", str_logtash_config_normalize( v("content") ) )
Use a JSON function to expand the data.
pythone_json("normalize_data",depth=1,fmt='full')
To sum up, use the following LOG DSL rules:
pythone_set( "normalize_data", str_logtash_config_normalize( v("content") ) ) e_json( "normalize_data", depth=1, fmt='full' )
Log after transformation.
content : { "pod" => { "name" => "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node" => { "name" => "tw5" }, "labels" => { "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow" }, "container" => { "name" => "crm-learning-follow" }, "namespace" => "testing1" } normalize_data: { "pod": { "name": "crm-learning-follow-7bc48f8b6b-m6kgb" }, "node": { "name": "tw5" }, "labels": { "pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow" }, "container": { "name": "crm-learning-follow" }, "namespace": "testing1" } normalize_data.container.container: {"name": "crm-learning-follow"} normalize_data.labels.labels: {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"} normalize_data.namespace: testing1 normalize_data.node.node: {"name": "tw5"} normalize_data.pod.pod: {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}
Convert text written in special encoding formats
Hexadecimal characters that are recorded in daily work need to be decoded before they can be read.Use the str_hex_escape_encode
function to escape hexadecimal characters.
Raw log entries
content : "\xe4\xbd\xa0\xe5\xa5\xbd"
LOG DSL orchestration
pythone_set("hex_encode",str_hex_escape_encode(v("content")))
Log after transformation.
content : "\xe4\xbd\xa0\xe5\xa5\xbd" hex_encode : "hello"
Expand XML fields
You may encounter various types of data during your daily work, such as XML data.If you want to convert XML data to JSON data, you can use the xml_to_json
function.
Test logs
str : <?xmlversion="1.0"?> <data> <countryname="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighborname="Austria"direction="E"/> <neighborname="Switzerland"direction="W"/> </country> <countryname="Singapore"> <rank>4</rank> <year>2011</year> <gdppc>59900</gdppc> <neighborname="Malaysia"direction="N"/> </country> <countryname="Panama"> <rank>68</rank> <year>2011</year> <gdppc>13600</gdppc> <neighborname="Costa Rica"direction="W"/> <neighborname="Colombia"direction="E"/> </country> </data>
LOG DSL orchestration
pythone_set("str_json",xml_to_json(v("str")))
Post processing logs
str : <?xmlversion="1.0"?> <data> <countryname="Liechtenstein"> <rank>1</rank> <year>2008</year> <gdppc>141100</gdppc> <neighborname="Austria"direction="E"/> <neighborname="Switzerland"direction="W"/> </country> <countryname="Singapore"> <rank>4</rank> <year>2011</year> <gdppc>59900</gdppc> <neighborname="Malaysia"direction="N"/> </country> <countryname="Panama"> <rank>68</rank> <year>2011</year> <gdppc>13600</gdppc> <neighborname="Costa Rica"direction="W"/> <neighborname="Colombia"direction="E"/> </country> </data> str_dict :{ "data": { "country": [{ "@name": "Liechtenstein", "rank": "1", "year": "2008", "gdppc": "141100", "neighbor": [{ "@name": "Austria", "@direction": "E" }, { "@name": "Switzerland", "@direction": "W" }] }, { "@name": "Singapore", "rank": "4", "year": "2011", "gdppc": "59900", "neighbor": { "@name": "Malaysia", "@direction": "N" } }, { "@name": "Panama", "rank": "68", "year": "2011", "gdppc": "13600", "neighbor": [{ "@name": "Costa Rica", "@direction": "W" }, { "@name": "Colombia", "@direction": "E" }] }] } }