I'm fishing for comments. The current API gives away too much from the POV of a compiler-writer.
And I need to be clear on one point -- if we don't resolve this, there are well-understood optimizations that Go compilers won't do, and these optimizations are important enough that they affect the way people write programs.
Current Go (<= 1.5) only inlines when it can inline entire call trees all the way to the leaves; it cannot, for example, inline a simple function Small that wraps a much larger function Huge, because Huge is not eligible for inlining and thus no caller of Huge is eligible either. This total-inlining policy means that Go programs never observe the call sites at which inlining occurs, and thus are never presented with the dilemma that a stack of "callers" can correspond to a single return PC.
If we allow more flexible inlining, and if A calls B, B calls runtime.Callers(1, pcslice), and B is inlined into A, then the "return pc" that Callers stores in pcslice will be an address within A (according to nm), and there will be no return PC identifying B, the file of B, or the line number within B.
Unfortunately, this identity (one return PC = one caller) is exposed in an interface -- a "return pc" is a uintptr. We could in theory hack on that integer to embed inline depth information in some unused bits, but in practice that is likely to break some code.
So, if we intend to improve inlining in the future, we need to do something about this. The options I see (I may be myopic; other options are welcome) include:
(a) simply ignore intermediate callers. In the example, the caller of Callers will be A, and any mention of B is lost. It's a systems programming language; optimizers do this stuff all the time in other languages and you love it when they do. Put on your grown-up pants and deal with it.
And it will prepare people for tail-call elimination.
And it will discourage them from thinking that they can use this for implementing Java-style caller-sensitive security, which is slow, difficult to reason about, and hard to maintain. Seriously, this has been a source of numerous security holes in Java; in JDK 8 they simplified it as much as possible to reduce their risk (see e.g. http://openjdk.java.net/jeps/176 and https://bugs.openjdk.java.net/browse/JDK-8046166 ).
cons: people might already critically rely on exact stack traces.
(b) hack on the encoding of the return PC to embed an index. I'd propose to grab the upper 2-3 bits on an Intel box or the lower 2 bits on a RISC architecture, and have those bits hold a counter indicating the inline depth. (This imposes an obvious restriction on inline depth, and I assume that I can grab these bits even on a 32-bit machine.) The 00 case will match what is seen in the nm output (in the above example, "A") and the 01 case corresponds to B. The symbol table information for "A" will need to provide the inlining depth at this site. In the example above, if it is IA32 and the raw return PC is 0x01234567, then the caller of Callers (B) is identified by return PC 0x41234567 and B's caller ("A") by return PC 0x01234567. On a RISC, if the raw return PC is 0x03217654, that value is also the return PC for "A", and 0x03217655 is the return PC for "B" (called by A).
pros: no change to interface, we never promised that "returnpc" was good for anything, did we?
cons: people thought we promised "returnpc" was a real live returnpc, and acted accordingly.
Fails to adequately discourage Java-style caller-sensitive security.
Also fails to prepare people for tail-call elimination.
(c) = (a) + extend the interface with a more generalized notion of program counter. Most old code will continue to sort-of work without change and with no risk of gigantic surprise (e.g., illegal address dereference from believing that an option-b PC was a real address); new code will use the improved interface.
Note that there's nothing novel about abstracting on the "return PC" to get something that allows you to talk precisely about backtraces in the presence of inlining -- Java implementations have been doing this for more than 15 years.