Jay Taylor's notes

How to capture ordered STDOUT/STDERR and add timestamp/prefixes?

Original source (unix.stackexchange.com)
Tags: bash linux howto shell-scripting streams unix.stackexchange.com
Clipped on: 2015-12-11

I have explored almost all available similar questions, to no avail.

Let me describe the problem in detail:

I run some unattended scripts that can produce standard output and standard error lines. I want to capture those lines in the precise order a terminal emulator would display them, and then add a prefix like "STDERR: " or "STDOUT: " to each of them.

I have tried using pipes, and even an epoll-based approach on them, to no avail. I think the solution lies in using a pty, although I am no master at that. I have also peeked into the source code of Gnome's VTE, but that has not been very productive.

Ideally I would use Go instead of Bash to accomplish this, but I have not been able to. It seems that pipes make it impossible to keep the correct line order because of buffering.

Has somebody been able to do something similar? Or is it just impossible? I think that if a terminal emulator can do it, then it is not impossible - maybe by creating a small C program that handles the PTY(s) differently?

Ideally I would use asynchronous input to read these 2 streams (STDOUT and STDERR) and then re-print them according to my needs, but the order of input is crucial!

NOTE: I am aware of stderred but it does not work for me with Bash scripts and cannot be easily edited to add a prefix (since it basically wraps plenty of syscalls).

Update: added two gists below

(sub-second random delays can be added to the sample script I provided to prove that the result is consistent)

Update: a solution to this question would also solve this other question, as @Gilles pointed out. However, I have come to the conclusion that it is not possible to do what is asked here and there. With 2>&1 both streams are correctly merged at the pty/pipe level, but to use the streams separately and in the correct order one would indeed have to use the approach of stderred, which involves syscall hooking and can be seen as dirty in many ways.

I will be eager to update this question if somebody can disprove the above.
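
As a minimal sketch of that trade-off (the function name merged_order is made up for illustration, and the demo assumes shell builtins, which write each line immediately instead of buffering it), merging with 2>&1 keeps the order but leaves only one stream to prefix:

```bash
# Hypothetical demo: alternate between stdout and stderr, merge both
# streams into one pipe with 2>&1, then tag whatever comes out.
merged_order() {
    { echo 1; echo 2 >&2; echo 3; echo 4 >&2; } 2>&1 | sed 's/^/MERGED: /'
}
```

Calling merged_order prints the four lines in the order 1, 2, 3, 4, but every line carries the same MERGED: prefix, because after the merge nothing tells the reader which stream a line came from.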

asked Sep 26 '14 at 10:26
Deim0s
@slm probably not, since OP needs to prepend different strings to different streams. – peterph Sep 26 '14 at 12:09
Can you share why the order is so important? Maybe there could be some other way around your problem... – peterph Sep 26 '14 at 12:12
@peterph it's a prerequisite, if I can't have consistent output I'd rather send it to /dev/null than read it and get confused by it :) 2>&1 preserves order for example, but doesn't allow the kind of customization that I ask in this question – Deim0s Sep 26 '14 at 12:27

You might use coprocesses. Here is a simple wrapper that feeds both outputs of a given command to two sed instances (one for stderr, the other for stdout), which do the tagging.

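#!/bin/bash
# Keep a copy of the original stdout on fd 3 for the stdout tagger.
exec 3>&1
coproc SEDo ( sed "s/^/STDOUT: /" >&3 )
# Move the original stderr to fd 4 (this closes fd 2 in the wrapper itself).
exec 4>&2-
coproc SEDe ( sed "s/^/STDERR: /" >&4 )
# Run the given command with each stream wired to its own tagger.
eval "$@" 2>&${SEDe[1]} 1>&${SEDo[1]}
# Close the write ends so that both sed instances see EOF and exit.
eval exec "${SEDo[1]}>&-"
eval exec "${SEDe[1]}>&-"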

Note several things:

  1. It is a magic incantation for many people (including me) - for a reason (see the linked answer below).

  2. There is no guarantee it won't occasionally swap a couple of lines - it all depends on the scheduling of the coprocesses. Actually, it is almost guaranteed that at some point it will. That said, if you need to keep the order strictly the same, you have to process the data from both stderr and stdout in the same process, otherwise the kernel scheduler can (and will) make a mess of it.

    If I understand the problem correctly, it means that you would need to instruct the shell to redirect both streams to one process (which can be done AFAIK). The trouble starts when that process has to decide what to act upon first: it would have to poll both data sources, and at some point it will be busy processing one stream when data arrives on both streams before it finishes. That is exactly where it breaks down. It also means that wrapping the output syscalls, as stderred does, is probably the only way to achieve your desired outcome (and even then you might have a problem once something becomes multithreaded on a multiprocessor system).

As for coprocesses, be sure to read Stéphane's excellent answer to How do you use the command coproc in Bash? for in-depth insight.
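
To make point 2 concrete, here is a small sketch of what a reader is left with once the two streams go to separate sinks. The function name demo_lost_order is hypothetical, and it uses two temporary files instead of pipes purely to keep the demo deterministic; the information that survives is the same.

```bash
# Hypothetical demo: the writer alternates between the streams (1, 2, 3, 4),
# but each stream is captured in its own sink.
demo_lost_order() {
    dir=$(mktemp -d) || return 1
    { echo 1; echo 2 >&2; echo 3; echo 4 >&2; } >"$dir/out" 2>"$dir/err"
    # Each sink preserves the order within its own stream...
    sed 's/^/STDOUT: /' "$dir/out"
    sed 's/^/STDERR: /' "$dir/err"
    # ...but nothing recorded how the two streams interleaved.
    rm -rf "$dir"
}
```

Running demo_lost_order prints STDOUT: 1, STDOUT: 3, STDERR: 2, STDERR: 4. The per-stream order is intact, yet the original 1, 2, 3, 4 sequence cannot be reconstructed from the two sinks alone.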

answered Sep 26 '14 at 10:56
peterph
Thanks @peterph for your answer, however I am looking specifically for ways to preserve the order. Note: I think your interpreter should be bash because of the process substitution you use (I get ./test1.sh: 3: ./test1.sh: Syntax error: "(" unexpected by copy/pasting your script) – Deim0s Sep 26 '14 at 11:32
Very likely so, I ran it in bash with /bin/sh (not sure why I had it there). – peterph Sep 26 '14 at 11:36
I've updated the question a bit, regarding where the stream mix-up could happen. – peterph Sep 26 '14 at 12:33
Thanks @peterph, however I have not given up on alternatives to total syscall wrapping, and I'd like to explore how this is done in the VTE library and with ptys in general. Perhaps nobody has tried those routes, or somebody already found that syscall wrapping is the only way; the reason behind it is still interesting to me (remember: on a terminal emulator it looks good...) – Deim0s Sep 26 '14 at 12:38
What you see on the terminal is basically what you get with 2>&1 - see ls -al /proc/$$/fd. The two streams end in the same sink. – peterph Sep 26 '14 at 14:32
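
The check suggested in that last comment can be scripted. This is only a sketch: it assumes a Linux /proc filesystem, the function name same_sink is invented here, and comparing link targets shows that both descriptors name the same file, terminal or pipe, which is a weaker statement than proving they share one open file description.

```bash
# Hypothetical helper: succeed if fd 1 and fd 2 resolve to the same target.
same_sink() {
    # Duplicate fd 1 and fd 2 onto fd 3 and fd 4 first, because each command
    # substitution below replaces fd 1 with its own capture pipe.
    {
        out_target=$(readlink /proc/self/fd/3)
        err_target=$(readlink /proc/self/fd/4)
    } 3>&1 4>&2
    [ "$out_target" = "$err_target" ]
}
```

In an interactive terminal, `same_sink && echo shared` would be expected to print shared, since both descriptors normally point at the same pty device.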

Method #1. Using file descriptors and awk

What about something like this using the solutions from this SO Q&A titled: Is there a Unix utility to prepend timestamps to lines of text? and this SO Q&A titled: pipe STDOUT and STDERR to two different processes in shell script?.

The approach

Step 1: create 2 functions in Bash that prepend a stream label and a timestamp to every line they read. Note that strftime is not part of POSIX awk; GNU awk provides it.

$ msgOut () {  awk '{ print strftime("STDOUT: %Y-%m-%d %H:%M:%S"), $0; fflush(); }'; }
$ msgErr () {  awk '{ print strftime("STDERR: %Y-%m-%d %H:%M:%S"), $0; fflush(); }'; }

Step 2: use the above functions like so to get the desired messaging:

$ { { ...command/script...; } 2>&3 | msgOut; } 3>&1 1>&2 | msgErr

Example

Here I've concocted an example that writes a to STDOUT, sleeps for 10 seconds, and then writes b to STDERR. When we put this command sequence into our construct above we get messaging as you specified.

$ { { echo a; sleep 10; echo >&2 b; } 2>&3 | \
    msgOut; } 3>&1 1>&2 | msgErr
STDOUT: 2014-09-26 09:22:12 a
STDERR: 2014-09-26 09:22:22 b
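
If GNU awk is not available, the same fd-swapping idea can be sketched with plain shell functions. The names tag and run_tagged are made up for this example, and calling date once per line is an assumption made for portability, not for speed. As with any two-process tagger, lines from the two streams can still be swapped relative to each other.

```bash
# Hypothetical helper: prefix every line on stdin with "$1" and a timestamp.
tag() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s %s %s\n' "$1" "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Hypothetical wrapper: run a command with stdout and stderr tagged separately.
# The command's stdout is piped to the STDOUT tagger; its stderr travels over
# fd 3 to the STDERR tagger.
run_tagged() {
    { "$@" 2>&3 | tag "STDOUT:"; } 3>&1 1>&2 | tag "STDERR:"
}
```

For example, `run_tagged sh -c 'echo a; echo b >&2'` tags a as STDOUT and b as STDERR. Because of the descriptor swap, the tagged STDOUT lines are written to the wrapper's stderr and the tagged STDERR lines to its stdout.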

Method #2. Using annotate-output

There's a tool called annotate-output that's part of the devscripts package that will do what you want. Its only restriction is that it must run the scripts for you.

Example

If we put our above example command sequence into a script called mycmds.bash like so:

$ cat mycmds.bash 
#!/bin/bash

echo a
sleep 10
echo >&2 b

We can then run it like this:

$ annotate-output ./mycmds.bash 
09:48:00 I: Started ./mycmds.bash
09:48:00 O: a
09:48:10 E: b
09:48:10 I: Finished with exitcode 0

The output's format can be controlled for the timestamp portion but not beyond that. But it's similar output to what you're looking for, so it may fit the bill.
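
As a sketch of that timestamp control, assuming your devscripts version supports the +FORMAT argument described in the annotate-output manual page (a date(1)-style format string), a small wrapper might look like this. The function name run_annotated and the chosen format are illustrative only.

```bash
# Hypothetical wrapper: run a command under annotate-output with a full
# date-and-time stamp, or report that the tool is not installed.
run_annotated() {
    if command -v annotate-output >/dev/null 2>&1; then
        # +FORMAT is passed through to date(1); this one has no spaces.
        annotate-output +%Y-%m-%dT%H:%M:%S "$@"
    else
        echo "annotate-output not found (it ships in the devscripts package)" >&2
        return 127
    fi
}
```

It would be used as `run_annotated ./mycmds.bash`; only the timestamp changes, while the I:, O: and E: markers stay as they are.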

answered Sep 26 '14 at 13:29
slm
unfortunately this also doesn't solve the problem of possibly swapping some lines. – peterph Sep 26 '14 at 14:33
exactly. I think the answer to this question of mine is "not possible". Even with stderred you cannot easily determine the boundaries of lines (trying to do so would be hackish). I wanted to see if somebody could help me with this problem, but apparently everybody wants to give up the single constraint (order) that is the basis for the question – Deim0s Sep 26 '14 at 16:24
