Data masking
When you transform, transmit, or use data, you can configure data masking rules to reduce the exposure of sensitive data and mitigate the risk of data breaches. This topic describes how to use functions to mask sensitive data in various scenarios.
Scenario 1: Mask mobile phone numbers
Solution. To mask mobile phone numbers in log entries, you can use the regex_replace function.
Example
Raw log entries
iphone: 13012345678
DSL orchestration rule
e_set("sec_iphone", regex_replace(v('iphone'), r"(\d{0,3})\d{4}(\d{4})", replace=r"\1****\2"))
Transformation result
iphone: 13012345678
sec_iphone: 130****5678
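The regular expression in this rule can be checked outside the DSL. The following is a minimal sketch that uses Python's re.sub as a stand-in for regex_replace (an assumption; the DSL function itself is not called here). The name mask_phone is hypothetical.

```python
import re

# Same pattern as in the rule: keep up to 3 leading digits, drop the
# next 4 digits, keep the last 4 digits.
PHONE = r"(\d{0,3})\d{4}(\d{4})"

def mask_phone(value: str) -> str:
    """Replace the middle four digits with asterisks (hypothetical helper)."""
    return re.sub(PHONE, r"\1****\2", value)

masked = mask_phone("13012345678")
print(masked)  # 130****5678
```

The two capturing groups are what make the partial mask possible: \1 and \2 put the retained digits back around the asterisks.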
Scenario 2: Mask bank card information
Solution. To mask bank card information or credit card information in log entries, you can use the regex_replace function.
Example
Raw log entries
content: bank number is 491648411333978312 and credit card number is 4916484113339780
DSL orchestration rule
e_set("bank_number", regex_replace(v('content'), r'([1-9]{1})(\d{11}|\d{13}|\d{14})(\d{4})', replace=r"****\3"))
Transformation result
content: bank number is 491648411333978312 and credit card number is 4916484113339780
bank_number: bank number is ****978312 and credit card number is ****9780
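The 18-digit bank number ends in 978312 rather than four digits because the alternation tries \d{11} first, so only the first 16 digits are matched and the trailing 12 is left in place. The sketch below reproduces this with Python's re module, assuming that regex_replace follows the same left-to-right alternation order.

```python
import re

CARD = r"([1-9]{1})(\d{11}|\d{13}|\d{14})(\d{4})"
content = ("bank number is 491648411333978312 "
           "and credit card number is 4916484113339780")

# Show exactly which digit runs the pattern consumes.
matched = [m.group(0) for m in re.finditer(CARD, content)]
print(matched)  # ['4916484113339783', '4916484113339780']

# Group 3 holds the last four digits of each 16-digit match.
masked = re.sub(CARD, r"****\3", content)
print(masked)
# bank number is ****978312 and credit card number is ****9780
```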
Scenario 3: Mask email addresses
Solution. To mask email addresses in log entries, you can use the regex_replace function.
Example
Raw log entries
content: email is twiss2345@aliyun.com
DSL orchestration rule
e_set("email_encrypt", regex_replace(v('content'), r'[A-Za-z\d]+([-_.][A-Za-z\d]+)*(@([A-Za-z\d]+[-.])+[A-Za-z\d]{2,4})', replace=r"****\2"))
Transformation result
content: email is twiss2345@aliyun.com
email_encrypt: email is ****@aliyun.com
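The replacement keeps group 2, which is the part of the address from the at sign onward. The following sketch inspects that group with Python's re module, used here as an assumed stand-in for regex_replace.

```python
import re

EMAIL = r"[A-Za-z\d]+([-_.][A-Za-z\d]+)*(@([A-Za-z\d]+[-.])+[A-Za-z\d]{2,4})"
content = "email is twiss2345@aliyun.com"

# Group 2 spans "@" plus the domain; the local part is outside it.
match = re.search(EMAIL, content)
domain_part = match.group(2)
print(domain_part)  # @aliyun.com

# Replace the whole address with asterisks followed by the domain part.
masked = re.sub(EMAIL, r"****\2", content)
print(masked)  # email is ****@aliyun.com
```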
Scenario 4: Mask AccessKey pairs
Solution. To mask AccessKey pairs in log entries, you can use the regex_replace function.
Example
Raw log entries
content: ak id is rDhc9qxjhIhlBiyphP7buo5yg5h6Eq and ak key is XQr1EPtfnlZLYlQc
DSL orchestration rule
e_set("akid_encrypt", regex_replace(v('content'), r'([a-zA-Z0-9]{4})(([a-zA-Z0-9]{26})|([a-zA-Z0-9]{12}))', replace=r"\1****"))
Transformation result
content: ak id is rDhc9qxjhIhlBiyphP7buo5yg5h6Eq and ak key is XQr1EPtfnlZLYlQc
akid_encrypt: ak id is rDhc**** and ak key is XQr1****
Scenario 5: Mask IP addresses
Solution. To capture and mask IP addresses in log entries, you can use the regex_replace function together with the grok function.
Example
Raw log entries
content: ip is 192.168.1.1
DSL orchestration rule
e_set("ip_encrypt", regex_replace(v('content'), grok('(%{IP})'), replace=r"****"))
Transformation result
content: ip is 192.168.1.1
ip_encrypt: ip is ****
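In this rule, the grok function supplies the regular expression that regex_replace applies. As a rough illustration, the sketch below uses a hand-written IPv4 pattern in Python. The pattern is a simplified assumption, not the definition behind %{IP}: it does not check that each octet is at most 255 and does not cover IPv6.

```python
import re

# Simplified, hypothetical IPv4 pattern: four dot-separated digit runs.
IPV4 = r"\b(?:\d{1,3}\.){3}\d{1,3}\b"

content = "ip is 192.168.1.1"
masked = re.sub(IPV4, "****", content)
print(masked)  # ip is ****
```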
Scenario 6: Mask ID card numbers
Solution. To capture and mask ID card numbers in log entries, you can use the regex_replace function together with the grok function.
Example
Raw log entries
content: Id card is 11010519491231002X
DSL orchestration rule
e_set("id_encrypt", regex_replace(v('content'), grok('(%{CHINAID})'), replace=r"\1****"))
Transformation result
content: Id card is 11010519491231002X
id_encrypt: Id card is 110105****
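The result keeps only the first six digits of the ID card number. One way to get that shape with a plain regular expression is to capture those six digits in their own group. The sketch below does this in Python with a hypothetical, simplified pattern; it is not the CHINAID grok definition and performs no date or checksum validation.

```python
import re

# Hypothetical simplified pattern: 6 leading digits (captured),
# 11 further digits, then a final check character (digit or X).
ID_CARD = r"\b(\d{6})\d{11}[\dXx]\b"

content = "Id card is 11010519491231002X"
masked = re.sub(ID_CARD, r"\1****", content)
print(masked)  # Id card is 110105****
```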
Scenario 7: Mask URLs
Solution. To mask URLs in log entries, you can use the base64_encoding function to encode the URLs. The encoded values can be restored to the original URLs by using the base64_decoding function.
Example
Raw log entries
url: https://www.aliyun.com/sls?logstore
DSL orchestration rule
e_set("base64_url", base64_encoding(v("url")))
Transformation result
url: https://www.aliyun.com/sls?logstore
base64_url: aHR0cHM6Ly93d3cuYWxpeXVuLmNvbS9zbHM/bG9nc3RvcmU=
Note To decode the value of the base64_url field, you can use the base64_decoding(v("base64_url")) function.
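Base64 is a reversible encoding, so the original URL can always be recovered from the encoded value. The round trip can be sketched with Python's standard base64 module, assuming UTF-8 text; the DSL functions themselves are not called here.

```python
import base64

url = "https://www.aliyun.com/sls?logstore"

# Encode: text -> bytes -> Base64 text.
encoded = base64.b64encode(url.encode("utf-8")).decode("ascii")
print(encoded)  # aHR0cHM6Ly93d3cuYWxpeXVuLmNvbS9zbHM/bG9nc3RvcmU=

# Decode: Base64 text -> bytes -> original text.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded == url)  # True
```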
Scenario 8: Mask order numbers
Solution. To mask order numbers in log entries so that other users cannot decode them, you can use the md5_encoding function to compute the MD5 hash of each order number.
Example
Raw log entries
orderId: 15121412314
DSL orchestration rule
e_set("md5_orderId", md5_encoding(v("orderId")))
Transformation result
orderId: 15121412314
md5_orderId: 852751f9aa48303a5691b0d020e52a0a
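Unlike Base64, MD5 is a one-way hash: there is no decoding function for the result. The sketch below computes an MD5 hex digest with Python's hashlib, assuming that the value is hashed as UTF-8 bytes. The helper name md5_hex is hypothetical.

```python
import hashlib

def md5_hex(value: str) -> str:
    """Return the MD5 digest of value as 32 hexadecimal characters."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()

digest = md5_hex("15121412314")
print(len(digest))                       # 32
print(digest == md5_hex("15121412314"))  # True
```

Because the same order number always produces the same digest, masked values can still be grouped or matched against each other.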
Scenario 9: Mask strings
Solution. To mask key characters or strings in log entries, you can use the str_translate function to configure mapping rules for the characters or strings.
Example
Raw log entries
data: message level is info
DSL orchestration rule
e_set("data_translate", str_translate(v("data"), "aeiou", "12345"))
Transformation result
data: message level is info
data_translate: m2ss1g2 l2v2l 3s 3nf4
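The mapping is positional: the first character of "aeiou" is replaced with the first character of "12345", the second with the second, and so on. Python's str.translate behaves the same way and is used below as an assumed equivalent of str_translate.

```python
# Build a character-by-character mapping: a->1, e->2, i->3, o->4, u->5.
table = str.maketrans("aeiou", "12345")

data = "message level is info"
translated = data.translate(table)
print(translated)  # m2ss1g2 l2v2l 3s 3nf4
```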