Jay Taylor's notes
How can I remove the ANSI escape sequences from a string in python
This is my string:
'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
I was using code to retrieve the output from an SSH command, and I want my string to contain only 'examplefile.zip'.
What can I use to remove the extra escape sequences?
Delete them with a regular expression:
import re
ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
ansi_escape.sub('', sometext)
Demo:
>>> import re
>>> ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
>>> sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
>>> ansi_escape.sub('', sometext)
'ls\r\nexamplefile.zip\r\n'
(I've tidied up the escape sequence expression to follow the Wikipedia overview of ANSI escape codes, focusing on the CSI sequences, and ignoring the C1 codes as they are never used in today's UTF-8 world).
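Since sub() returns a new string rather than changing its argument, the result has to be captured somewhere. A minimal sketch, assuming the goal is the bare filename from the question's sample output; the helper name visible_lines is made up for illustration and is not part of the original answer:

```python
import re

# CSI pattern from the answer above.
ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')

def visible_lines(text):
    """Strip CSI sequences, then return the non-empty lines (hypothetical helper)."""
    cleaned = ansi_escape.sub('', text)  # sub() returns a new string
    return [line for line in cleaned.splitlines() if line]

sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
lines = visible_lines(sometext)
print(lines)      # ['ls', 'examplefile.zip']
print(lines[-1])  # examplefile.zip
```

Taking the last line only works for this particular sample, where the echoed command comes first and a single filename follows.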
-
The line ansi_escape.sub('', sometext) should be assigned to your final variable. – crafter Feb 4 at 14:14
-
The accepted answer to this question only considers color and font effects. There are a lot of sequences that do not end in 'm', such as cursor positioning, erasing, and scroll regions.
The complete regexp for Control Sequences (aka ANSI Escape Sequences) is
/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/
Refer to ECMA-48 Section 5.4 and ANSI escape code
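The expression above is written as a JavaScript-style regex literal. A sketch of how it might look in Python, assuming the input is a str (not bytes) and dropping the surrounding slashes; the sample strings are invented to show sequences that do not end in 'm':

```python
import re

# Same character classes as the expression above, without the / delimiters.
control_seq = re.compile(r'(?:\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')

# Erase display, cursor positioning, erase in line: none of these end in 'm'.
screen = '\x1b[2J\x1b[10;20Hhello\x1b[K'
print(control_seq.sub('', screen))   # hello

# The single-character CSI introducer (U+009B) is handled as well.
c1_form = '\x9b1mbold\x9b0m'
print(control_seq.sub('', c1_form))  # bold
```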
-
OSC is an "ANSI escape sequence", is frequently used, and would begin with a different pattern. Your answer is incomplete. – Thomas Dickey Aug 4 '16 at 7:57
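To illustrate the OSC point: an OSC sequence starts with ESC ] and runs until BEL (\x07) or the string terminator ESC \. A rough sketch of one way to strip those alongside CSI sequences; the osc pattern and the function name are assumptions for illustration, not something taken from the answers here:

```python
import re

# Assumed OSC pattern: ESC ] ... terminated by BEL or by ST (ESC \).
osc = re.compile(r'\x1B\][^\x07\x1B]*(?:\x07|\x1B\\)')
# CSI pattern from the accepted answer.
csi = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')

def strip_osc_and_csi(text):
    """Remove OSC sequences first, then CSI sequences (hypothetical helper)."""
    return csi.sub('', osc.sub('', text))

# A window-title OSC followed by a bold prompt.
titled = '\x1b]0;my window title\x07\x1b[1mprompt$\x1b[0m '
print(repr(strip_osc_and_csi(titled)))  # 'prompt$ '
```

The CSI expression alone would leave the title text in the output, because ESC ] is not ESC [.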
-
This doesn't work for color codes produced by bluetoothctl, example: \x1b[0;94m. Making the expression case insensitive or replacing 1B with 1b in the pattern made no difference. I'm using Python and the line re.compile(r'/(\x9b|\x1b\[)[0-?]*[ -\/]*[@-~]/', re.I). Then I'm doing pattern.sub("", my_string) which doesn't accomplish anything. Am I doing something wrong? – Hubro Dec 30 '16 at 8:11
-
(I was too slow to edit my previous comment). I assume your pattern is using features not available in Python's re module? – Hubro Dec 30 '16 at 8:18
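A likely explanation for the behaviour described in these comments: the leading and trailing / are regex-literal delimiters in languages such as JavaScript, but Python's re module treats them as literal slash characters, so the pattern can only match where a '/' surrounds the sequence. A small sketch that compares both forms, using a sample string modelled on the example in the comment:

```python
import re

sample = '\x1b[0;94mDevice\x1b[0m'

# The slashes are ordinary characters here, so nothing in `sample` matches.
with_slashes = re.compile(r'/(\x9b|\x1b\[)[0-?]*[ -\/]*[@-~]/', re.I)
# Same expression with the delimiters removed.
without_slashes = re.compile(r'(\x9b|\x1b\[)[0-?]*[ -\/]*[@-~]')

print(with_slashes.sub('', sample) == sample)  # True: nothing was removed
print(repr(without_slashes.sub('', sample)))   # 'Device'
```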
Function
Based on Martijn Pieters's answer with Jeff's regexp.
def escape_ansi(line):
    ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')
    return ansi_escape.sub('', line)
Test
def test_remove_ansi_escape_sequence(self):
    line = '\t\u001b[0;35mBlabla\u001b[0m \u001b[0;36m172.18.0.2\u001b[0m'
    escaped_line = escape_ansi(line)
    self.assertEqual(escaped_line, '\tBlabla 172.18.0.2')
Testing
If you want to run it yourself, use python3 (better Unicode support). Here is how the test file should look:
import unittest
import re

def escape_ansi(line):
    …

class TestStringMethods(unittest.TestCase):
    def test_remove_ansi_escape_sequence(self):
        …

if __name__ == '__main__':
    unittest.main()
-
Why have you left the / escaped in the second to last character set [ -\/]? – Andrew Gelnar Aug 10 '16 at 12:04
The suggested regex didn't do the trick for me, so I created one of my own. The following is a Python regex that I wrote based on the spec found here
ansi_regex = r'\x1b(' \
r'(\[\??\d+[hl])|' \
r'([=<>a-kzNM78])|' \
r'([\(\)][a-b0-2])|' \
r'(\[\d{0,2}[ma-dgkjqi])|' \
r'(\[\d+;\d+[hfy]?)|' \
r'(\[;?[hf])|' \
r'(#[3-68])|' \
r'([01356]n)|' \
r'(O[mlnp-z]?)|' \
r'(/Z)|' \
r'(\d+)|' \
r'(\[\?\d;\d0c)|' \
r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)
I tested my regex on the following snippet (basically a copy paste from the ascii-table.com page)
\x1b[20h Set
\x1b[?1h Set
\x1b[?3h Set
\x1b[?4h Set
\x1b[?5h Set
\x1b[?6h Set
\x1b[?7h Set
\x1b[?8h Set
\x1b[?9h Set
\x1b[20l Set
\x1b[?1l Set
\x1b[?2l Set
\x1b[?3l Set
\x1b[?4l Set
\x1b[?5l Set
\x1b[?6l Set
\x1b[?7l Reset
\x1b[?8l Reset
\x1b[?9l Reset
\x1b= Set
\x1b> Set
\x1b(A Set
\x1b)A Set
\x1b(B Set
\x1b)B Set
\x1b(0 Set
\x1b)0 Set
\x1b(1 Set
\x1b)1 Set
\x1b(2 Set
\x1b)2 Set
\x1bN Set
\x1bO Set
\x1b[m Turn
\x1b[0m Turn
\x1b[1m Turn
\x1b[2m Turn
\x1b[4m Turn
\x1b[5m Turn
\x1b[7m Turn
\x1b[8m Turn
\x1b[1;2 Set
\x1b[1A Move
\x1b[2B Move
\x1b[3C Move
\x1b[4D Move
\x1b[H Move
\x1b[;H Move
\x1b[4;3H Move
\x1b[f Move
\x1b[;f Move
\x1b[1;2 Move
\x1bD Move/scroll
\x1bM Move/scroll
\x1bE Move
\x1b7 Save
\x1b8 Restore
\x1bH Set
\x1b[g Clear
\x1b[0g Clear
\x1b[3g Clear
\x1b#3 Double-height
\x1b#4 Double-height
\x1b#5 Single
\x1b#6 Double
\x1b[K Clear
\x1b[0K Clear
\x1b[1K Clear
\x1b[2K Clear
\x1b[J Clear
\x1b[0J Clear
\x1b[1J Clear
\x1b[2J Clear
\x1b5n Device
\x1b0n Response:
\x1b3n Response:
\x1b6n Get
\x1b[c Identify
\x1b[0c Identify
\x1b[?1;20c Response:
\x1bc Reset
\x1b#8 Screen
\x1b[2;1y Confidence
\x1b[2;2y Confidence
\x1b[2;9y Repeat
\x1b[2;10y Repeat
\x1b[0q Turn
\x1b[1q Turn
\x1b[2q Turn
\x1b[3q Turn
\x1b[4q Turn
\x1b< Enter/exit
\x1b= Enter
\x1b> Exit
\x1bF Use
\x1bG Use
\x1bA Move
\x1bB Move
\x1bC Move
\x1bD Move
\x1bH Move
\x1b12 Move
\x1bI
\x1bK
\x1bJ
\x1bZ
\x1b/Z
\x1bOP
\x1bOQ
\x1bOR
\x1bOS
\x1bA
\x1bB
\x1bC
\x1bD
\x1bOp
\x1bOq
\x1bOr
\x1bOs
\x1bOt
\x1bOu
\x1bOv
\x1bOw
\x1bOx
\x1bOy
\x1bOm
\x1bOl
\x1bOn
\x1bOM
\x1b[i
\x1b[1i
\x1b[4i
\x1b[5i
Hopefully this will help others :)
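A list like the one above lends itself to an automated check. A rough sketch of such a harness; not_fully_matched is a hypothetical helper, and the CSI-only pattern from the accepted answer is used as the pattern under test purely as an example (any compiled pattern could be passed in instead):

```python
import re

def not_fully_matched(pattern, sequences):
    """Return the sequences that `pattern` does not match in full."""
    return [seq for seq in sequences if pattern.fullmatch(seq) is None]

# CSI-only pattern from the accepted answer, used here as the pattern under test.
csi_only = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')

# A handful of sequences taken from the list above (descriptions dropped).
samples = ['\x1b[20h', '\x1b[?1h', '\x1b[4;3H', '\x1b=', '\x1b(A', '\x1bOP']

for seq in not_fully_matched(csi_only, samples):
    print(repr(seq))  # prints the three non-CSI sequences
```

For these samples the CSI-only pattern leaves '\x1b=', '\x1b(A' and '\x1bOP' unmatched, since none of them starts with ESC [.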
If you want to remove the \r\n bit, you can pass the string through this function (written by sarnold):
def stripEscape(string):
    """ Removes all control characters (0x01-0x1F) from the input string """
    delete = ""
    i = 1
    while (i < 0x20):
        delete += chr(i)
        i += 1
    # Python 2 form of str.translate: a None table plus the characters to delete.
    t = string.translate(None, delete)
    return t
Careful though, this will lump together the text in front of and behind the escape sequences. So, using Martijn's filtered string 'ls\r\nexamplefile.zip\r\n', you will get lsexamplefile.zip. Note the ls in front of the desired filename.
The order of the two steps matters. stripEscape also deletes the ESC character (\x1b) itself, so running it first leaves fragments such as [01;31m behind, and Martijn's regular expression can no longer match them. Apply the regular expression first and stripEscape second; to keep ls separate from the filename, split on the line breaks before stripping them.
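The function above relies on the Python 2 form of str.translate; in Python 3 the method takes a single table argument. A sketch of a Python 3 counterpart that also compares the two orderings on the question's string (strip_control_chars is a hypothetical name, not from the original answer):

```python
import re

# Translation table that deletes the control characters 0x01-0x1F.
_CONTROL_CHARS = str.maketrans('', '', ''.join(chr(i) for i in range(1, 0x20)))

def strip_control_chars(text):
    """Python 3 counterpart of stripEscape (hypothetical name)."""
    return text.translate(_CONTROL_CHARS)

ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
raw = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'

# Regex first, then the control-character strip.
regex_first = strip_control_chars(ansi_escape.sub('', raw))
print(regex_first)  # lsexamplefile.zip

# Strip first: ESC is deleted, so the regex finds nothing left to remove.
strip_first = ansi_escape.sub('', strip_control_chars(raw))
print(strip_first)  # ls[00m[01;31mexamplefile.zip[00m[01;31m
```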
asked 5 years, 8 months ago | viewed 29,888 times
site design / logo © 2018 Stack Exchange Inc; user contributions licensed under cc by-sa 3.0 with attribution required. rev 2018.10.30.31985