Jay Taylor's notes
BashPitfalls - Greg's Wiki
most of the time
    word=abcde
    expr "$word" : ".\(.*\)"
    bcde
But it WILL fail for the word "match":
    word=match
    expr "$word" : ".\(.*\)"
The problem is that "match" is a keyword. The solution (GNU only) is to prefix the string with a '+':
    word=match
    expr + "$word" : ".\(.*\)"
    atch
Or, y'know, stop using expr. You can do everything expr does by using Parameter Expansion. What's that thing up there trying to do? Remove the first letter of a word? That can be done in POSIX shells using PE, or in Bash with Substring Expansion (which is not part of POSIX sh):
    $ word=match
    $ echo "${word#?}"    # PE
    atch
    $ echo "${word:1}"    # SE
    atch
Seriously, there's no excuse for using expr unless you're on Solaris with its non-POSIX-conforming /bin/sh. It's an external process, so it's much slower than in-process string manipulation. And since nobody uses it, nobody understands what it's doing, so your code is obfuscated and hard to maintain.
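As a rough sketch of what "do it with Parameter Expansion" can look like, here are a few common expr idioms rewritten in plain POSIX sh. The function names (drop_first, str_length, strip_suffix) are made up for illustration and are not part of any standard tool; the expr commands in the comments are only there for comparison.

```shell
#!/bin/sh
# Hypothetical helpers; each one uses only POSIX parameter expansion,
# so no external process is started.

# Drop the first character, instead of: expr "$word" : ".\(.*\)"
drop_first() {
    printf '%s\n' "${1#?}"
}

# Length of a string, instead of: expr "$word" : '.*'
str_length() {
    printf '%s\n' "${#1}"
}

# Remove a literal suffix from the end of a string.
# Quoting "$2" inside the expansion keeps it from being read as a glob.
strip_suffix() {
    printf '%s\n' "${1%"$2"}"
}

drop_first match              # atch
str_length match              # 5
strip_suffix notes.txt .txt   # notes
```

Note that the word "match" needs no special treatment here, because nothing in the shell's expansion syntax treats it as a keyword.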
40. On UTF-8 and Byte-Order Marks (BOM)
In general: Unix UTF-8 text does not use a BOM. The encoding of plain text is determined by the locale or by MIME types or other metadata. While the presence of a BOM would not normally damage a UTF-8 document meant only for reading by humans, it is problematic (often syntactically illegal) in any text file meant to be interpreted by automated processes such as scripts, source code, configuration files, and so on. Files starting with a BOM should be considered just as foreign as those with MS-DOS linebreaks.
In shell scripting: 'Where UTF-8 is used transparently in 8-bit environments, the use of a BOM will interfere with any protocol or file format that expects specific ASCII characters at the beginning, such as the use of "#!" at the beginning of Unix shell scripts.'
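If you suspect a script has picked up a BOM, one way to check is to look at its first three bytes, which for a UTF-8 BOM are EF BB BF. The following is a minimal sketch, not a canonical tool: has_bom and strip_bom are hypothetical names, and the demo assumes mktemp is available (it is common but not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helpers for spotting and removing a UTF-8 BOM.

# Succeeds if the file named by $1 starts with the bytes EF BB BF.
has_bom() {
    [ "$(od -An -tx1 -N3 < "$1" | tr -d '[:space:]')" = "efbbbf" ]
}

# Writes the file's contents to stdout, without a leading BOM if one exists.
strip_bom() {
    if has_bom "$1"; then
        tail -c +4 < "$1"    # start output at the 4th byte
    else
        cat < "$1"
    fi
}

# Demo: build a script that begins with a BOM, then inspect it.
tmp=$(mktemp)                # assumption: mktemp exists on this system
printf '\357\273\277#!/bin/sh\necho hi\n' > "$tmp"
if has_bom "$tmp"; then
    echo "BOM found"
fi
strip_bom "$tmp" | head -n 1     # prints: #!/bin/sh
rm -f "$tmp"
```

With the BOM in place, the file's first two bytes are not "#!", which is why the kernel does not recognize the interpreter line.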
BashPitfalls (last edited 2020-06-18 19:11:42 by GreyCat)