Jay Taylor's notes

How can I remove the ANSI escape sequences from a string in python

Original source (stackoverflow.com)

Tags: python bash command-line terminal-color-codes-stripper terminal stackoverflow.com

Clipped on: 2018-10-30

up vote 42 down vote favorite

This is my string:

'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'

I am using code to retrieve the output of an SSH command, and I want my string to contain only 'examplefile.zip'.

What can I use to remove the extra escape sequences?

edited Feb 4 '13 at 19:22

Martijn Pieters♦

asked Feb 4 '13 at 19:07

SpartaSixZero

possible duplicate of Filtering out ANSI escape sequences – fuenfundachtzig Jun 18 '15 at 15:00

5 Answers

up vote 77 down vote accepted

Delete them with a regular expression:

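import re

# CSI sequence: ESC [, then parameter bytes, intermediate bytes and a final byte
ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
# sub() returns the cleaned string; assign it if you want to keep the result
ansi_escape.sub('', sometext)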

Demo:

>>> import re
>>> ansi_escape = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
>>> sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
>>> ansi_escape.sub('', sometext)
'ls\r\nexamplefile.zip\r\n'

(I've tidied up the escape sequence expression to follow the Wikipedia overview of ANSI escape codes, focusing on the CSI sequences and ignoring the 8-bit C1 codes, as they are essentially never used in today's UTF-8 world.)
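
To make the three character classes easier to read, here is a sketch of the same CSI pattern written with re.VERBOSE. The names CSI_PATTERN and strip_csi are illustrative and not part of the answer; the byte ranges in the comments follow the usual description of CSI sequences (parameter, intermediate and final bytes).

```python
import re

# Same CSI pattern as in the answer, spread out so each part can be labelled.
# CSI_PATTERN and strip_csi are hypothetical names used only for this sketch.
CSI_PATTERN = re.compile(r"""
    \x1B\[        # ESC followed by "[" (the 7-bit Control Sequence Introducer)
    [0-?]*        # parameter bytes, 0x30-0x3F (digits, ";", "?" and friends)
    [\x20-/]*     # intermediate bytes, 0x20-0x2F (space written as \x20)
    [@-~]         # final byte, 0x40-0x7E ("m" for colours, "H" for cursor moves)
""", re.VERBOSE)


def strip_csi(text):
    """Return text with every CSI sequence removed."""
    return CSI_PATTERN.sub('', text)


sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
print(repr(strip_csi(sometext)))
```

Because the final byte class is [@-~] rather than a literal m, the same pattern should also remove cursor-movement and erase sequences, not just colour codes.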

edited Dec 20 '17 at 16:17

answered Feb 4 '13 at 19:12

Martijn Pieters♦

The line ansi_escape.sub('', sometext) should be assigned to your final variable. – crafter Feb 4 at 14:14
@crafter: that's implied, yes. – Martijn Pieters♦ Feb 4 at 14:23

up vote 36 down vote

The accepted answer to this question only considers color and font effects. There are a lot of sequences that do not end in 'm', such as cursor positioning, erasing, and scroll regions.

The complete regexp for Control Sequences (aka ANSI Escape Sequences) is

/(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/

Refer to ECMA-48, Section 5.4, and the Wikipedia article on ANSI escape codes.
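
The pattern above is written as a /.../ regex literal, which Python's re module does not use: there the slashes would be treated as literal characters to match. A minimal sketch of the same expression in Python form follows; CONTROL_SEQUENCE and strip_control_sequences are illustrative names, and the input is assumed to be a str rather than bytes.

```python
import re

# Python takes the pattern as a plain string: no surrounding /.../ delimiters,
# and "/" needs no backslash inside a character class.
# CONTROL_SEQUENCE and strip_control_sequences are hypothetical names.
CONTROL_SEQUENCE = re.compile(r'(?:\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')


def strip_control_sequences(text):
    """Remove 7-bit (ESC [) and 8-bit (0x9B) control sequences from a str."""
    return CONTROL_SEQUENCE.sub('', text)


print(repr(strip_control_sequences('\x1b[0;94mDevice\x1b[0m')))  # 'Device'
```

Matching is already independent of how the hex escape is spelled (\x1B and \x1b are the same character), so re.I is not needed for that purpose.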

answered Nov 25 '15 at 20:02

Jeff

It misses OSC (both beginning and end). – Thomas Dickey Jul 29 '16 at 21:59
OSC is in ECMA-48 sec. 5.6 - what is the point of bringing that up here? – Jeff Aug 4 '16 at 1:36
OSC is an "ANSI escape sequence", is frequently used, and would begin with a different pattern. Your answer is incomplete. – Thomas Dickey Aug 4 '16 at 7:57
This doesn't work for color codes produced by bluetoothctl, example: \x1b[0;94m. Making the expression case insensitive or replacing 1B with 1b in the pattern made no difference. I'm using Python and the line re.compile(r'/(\x9b|\x1b\[)[0-?]*[ -\/]*[@-~]/', re.I). Then I'm doing pattern.sub("", my_string) which doesn't accomplish anything. Am I doing something wrong? – Hubro Dec 30 '16 at 8:11
(I was too slow to edit my previous comment). I assume your pattern is using features not available in Python's re module? – Hubro Dec 30 '16 at 8:18
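
For the OSC point raised in the comments, here is a hedged sketch of how such strings could be removed as well. It assumes an OSC string starts with ESC ] and ends with either BEL (0x07) or ST (ESC \), which is how terminals such as xterm use it for window titles; OSC_SEQUENCE, CSI_SEQUENCE and strip_osc_and_csi are illustrative names.

```python
import re

# Assumption: OSC strings start with ESC ] and are terminated by BEL (0x07)
# or by ST (ESC followed by a backslash). Names here are hypothetical.
OSC_SEQUENCE = re.compile(r'\x1B\][^\x07\x1B]*(?:\x07|\x1B\\)')
CSI_SEQUENCE = re.compile(r'(?:\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')


def strip_osc_and_csi(text):
    """Remove OSC strings first, then CSI sequences, from a str."""
    without_osc = OSC_SEQUENCE.sub('', text)
    return CSI_SEQUENCE.sub('', without_osc)


# A prompt that sets the window title and then prints a coloured user@host.
prompt = '\x1b]0;user@host: ~\x07\x1b[01;32muser@host\x1b[00m$ '
print(repr(strip_osc_and_csi(prompt)))
```

OSC strings are handled before CSI here so that the title text is removed together with its introducer and terminator.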

up vote 18 down vote

Function

Based on Martijn Pieters♦'s answer with Jeff's regexp.

import re

def escape_ansi(line):
    # Matches control sequences introduced by 0x9B or by ESC [
    ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')
    return ansi_escape.sub('', line)

Test

def test_remove_ansi_escape_sequence(self):
    line = '\t\u001b[0;35mBlabla\u001b[0m                                  \u001b[0;36m172.18.0.2\u001b[0m'

    escaped_line = escape_ansi(line)

    self.assertEqual(escaped_line, '\tBlabla                                  172.18.0.2')

Testing

If you want to run it yourself, use Python 3 (better Unicode support, among other things). The test file should look like this:

import unittest
import re

def escape_ansi(line):
    …

class TestStringMethods(unittest.TestCase):
    def test_remove_ansi_escape_sequence(self):
        …

if __name__ == '__main__':
    unittest.main()

edited Oct 3 '17 at 14:47

answered Jul 29 '16 at 15:51

Édouard Lopez

Why have you left the / escaped in the second to last character set [ -\/]? – Andrew Gelnar Aug 10 '16 at 12:04
@AndrewGelnar @ÉdouardLopez [ -/] will suffice. – Rodrigo Martins Oct 1 '17 at 19:09

up vote 7 down vote

The suggested regex didn't do the trick for me, so I created one of my own. The following is a Python regex that I created based on the spec found on ascii-table.com:

import re

ansi_regex = r'\x1b(' \
             r'(\[\??\d+[hl])|' \
             r'([=<>a-kzNM78])|' \
             r'([\(\)][a-b0-2])|' \
             r'(\[\d{0,2}[ma-dgkjqi])|' \
             r'(\[\d+;\d+[hfy]?)|' \
             r'(\[;?[hf])|' \
             r'(#[3-68])|' \
             r'([01356]n)|' \
             r'(O[mlnp-z]?)|' \
             r'(/Z)|' \
             r'(\d+)|' \
             r'(\[\?\d;\d0c)|' \
             r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)

I tested my regex on the following snippet (basically a copy-paste from the ascii-table.com page):

\x1b[20h    Set
\x1b[?1h    Set
\x1b[?3h    Set
\x1b[?4h    Set
\x1b[?5h    Set
\x1b[?6h    Set
\x1b[?7h    Set
\x1b[?8h    Set
\x1b[?9h    Set
\x1b[20l    Set
\x1b[?1l    Set
\x1b[?2l    Set
\x1b[?3l    Set
\x1b[?4l    Set
\x1b[?5l    Set
\x1b[?6l    Set
\x1b[?7l    Reset
\x1b[?8l    Reset
\x1b[?9l    Reset
\x1b=   Set
\x1b>   Set
\x1b(A  Set
\x1b)A  Set
\x1b(B  Set
\x1b)B  Set
\x1b(0  Set
\x1b)0  Set
\x1b(1  Set
\x1b)1  Set
\x1b(2  Set
\x1b)2  Set
\x1bN   Set
\x1bO   Set
\x1b[m  Turn
\x1b[0m Turn
\x1b[1m Turn
\x1b[2m Turn
\x1b[4m Turn
\x1b[5m Turn
\x1b[7m Turn
\x1b[8m Turn
\x1b[1;2    Set
\x1b[1A Move
\x1b[2B Move
\x1b[3C Move
\x1b[4D Move
\x1b[H  Move
\x1b[;H Move
\x1b[4;3H   Move
\x1b[f  Move
\x1b[;f Move
\x1b[1;2    Move
\x1bD   Move/scroll
\x1bM   Move/scroll
\x1bE   Move
\x1b7   Save
\x1b8   Restore
\x1bH   Set
\x1b[g  Clear
\x1b[0g Clear
\x1b[3g Clear
\x1b#3  Double-height
\x1b#4  Double-height
\x1b#5  Single
\x1b#6  Double
\x1b[K  Clear
\x1b[0K Clear
\x1b[1K Clear
\x1b[2K Clear
\x1b[J  Clear
\x1b[0J Clear
\x1b[1J Clear
\x1b[2J Clear
\x1b5n  Device
\x1b0n  Response:
\x1b3n  Response:
\x1b6n  Get
\x1b[c  Identify
\x1b[0c Identify
\x1b[?1;20c Response:
\x1bc   Reset
\x1b#8  Screen
\x1b[2;1y   Confidence
\x1b[2;2y   Confidence
\x1b[2;9y   Repeat
\x1b[2;10y  Repeat
\x1b[0q Turn
\x1b[1q Turn
\x1b[2q Turn
\x1b[3q Turn
\x1b[4q Turn
\x1b<   Enter/exit
\x1b=   Enter
\x1b>   Exit
\x1bF   Use
\x1bG   Use
\x1bA   Move
\x1bB   Move
\x1bC   Move
\x1bD   Move
\x1bH   Move
\x1b12  Move
\x1bI  
\x1bK  
\x1bJ  
\x1bZ  
\x1b/Z 
\x1bOP 
\x1bOQ 
\x1bOR 
\x1bOS 
\x1bA  
\x1bB  
\x1bC  
\x1bD  
\x1bOp 
\x1bOq 
\x1bOr 
\x1bOs 
\x1bOt 
\x1bOu 
\x1bOv 
\x1bOw 
\x1bOx 
\x1bOy 
\x1bOm 
\x1bOl 
\x1bOn 
\x1bOM 
\x1b[i 
\x1b[1i
\x1b[4i
\x1b[5i

Hopefully this will help others :)
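
A check along these lines can be sketched as follows: run the pattern over a few rows of the list and confirm that only the description is left. The pattern is copied unchanged from the answer; the samples list and the strip_rows helper are illustrative additions.

```python
import re

# kfir's pattern, copied unchanged from the answer above.
ansi_regex = r'\x1b(' \
             r'(\[\??\d+[hl])|' \
             r'([=<>a-kzNM78])|' \
             r'([\(\)][a-b0-2])|' \
             r'(\[\d{0,2}[ma-dgkjqi])|' \
             r'(\[\d+;\d+[hfy]?)|' \
             r'(\[;?[hf])|' \
             r'(#[3-68])|' \
             r'([01356]n)|' \
             r'(O[mlnp-z]?)|' \
             r'(/Z)|' \
             r'(\d+)|' \
             r'(\[\?\d;\d0c)|' \
             r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)

# A handful of rows from the list above (hypothetical selection for this sketch).
samples = [
    '\x1b[20h Set',
    '\x1b[?7l Reset',
    '\x1b(B Set',
    '\x1b[4;3H Move',
    '\x1b[2J Clear',
    '\x1b#8 Screen',
]


def strip_rows(rows):
    """Apply the pattern to each row and return what is left."""
    return [ansi_escape.sub('', row) for row in rows]


for before, after in zip(samples, strip_rows(samples)):
    print(repr(before), '->', repr(after))

# A two-parameter colour code, as in the question, for comparison.
print(repr(ansi_escape.sub('', '\x1b[01;31mexamplefile.zip')))
```

One thing to watch for: by my reading, a two-parameter colour code such as \x1b[01;31m is matched only up to the last digit (by the alternative ending in [hfy]?), so the trailing m is left in the output. The last print is there to check that against your own data.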

answered Aug 1 '17 at 21:47

kfir

up vote -1 down vote

If you want to remove the \r\n bit, you can pass the string through this function (written by sarnold):

def stripEscape(string):
    """ Removes all control characters (0x01-0x1F) from the input string """
    # Python 3: str.translate() takes a mapping of code points;
    # mapping a code point to None deletes that character
    delete = {i: None for i in range(1, 0x20)}
    return string.translate(delete)

Careful though: this will lump together the text in front of and behind the removed characters. So, using Martijn's filtered string 'ls\r\nexamplefile.zip\r\n', you will get lsexamplefile.zip. Note the ls in front of the desired filename.

The order matters when combining the two: apply Martijn's regular expression first, then pass its output through stripEscape. Running stripEscape first would delete the \x1b characters that the regular expression relies on, leaving fragments such as [01;31m behind.
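
One way to combine the two steps without gluing lines together is sketched below, with the regular expression applied first (removing control characters first would also remove the \x1b characters the pattern needs to match) and the text split into lines before the remaining control characters are deleted. ANSI_CSI, CONTROL_CHARS and clean_lines are illustrative names, and splitting into lines is an assumption about what the caller wants.

```python
import re

# Hypothetical names for this sketch; the pattern is the CSI regex from the
# accepted answer.
ANSI_CSI = re.compile(r'\x1B\[[0-?]*[ -/]*[@-~]')
# Translation table that deletes control characters 0x01-0x1F.
CONTROL_CHARS = {code: None for code in range(1, 0x20)}


def clean_lines(text):
    """Strip CSI sequences, split into lines, then drop stray control chars."""
    without_csi = ANSI_CSI.sub('', text)
    lines = []
    for line in without_csi.splitlines():
        cleaned = line.translate(CONTROL_CHARS)
        if cleaned.strip():  # skip lines that end up empty
            lines.append(cleaned)
    return lines


raw = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
print(clean_lines(raw))
```

Keeping the result as a list leaves the echoed command (ls) and the filename as separate items, so the caller can pick the one it needs.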

edited May 23 '17 at 12:34

Community♦