[LWN Logo]

Date:   Thu, 28 Oct 1999 15:14:24 -0500 (EST)
From:   Jim Blandy <jimb@cygnus.com>
To:     linux-kernel@vger.rutgers.edu
Subject: Debugging support for Pentium Streaming SIMD Extensions


Intel has contracted with Cygnus to add support for the Streaming
SIMD Extensions in GCC, GDB, and so on.  I've been working on the GDB
side of things.  I started with Doug Ledford's patches against 2.2.9,
adapted them to 2.2.12, and extended ptrace to provide access to the
SSE registers.  They are available from:

   http://sourceware.cygnus.com/gdb/papers/linux

The next GDB snapshot, available on Monday or Tuesday, will include
some support for examining SSE registers, etc.  GDB snapshots are
available from a link on this page:

   http://sourceware.cygnus.com/gdb

The patches here are preliminary, and I'm interested in hearing folks'
comments on them.  We're going to give our final version of these
patches to Intel, and Intel is going to announce them in some way, and
publicly contribute them to Linux.  So we would very much like to get
the patches in a form the kernel maintainers would actually approve
of.


Points of interest:

* I've added two new requests to ptrace: PTRACE_{GET,SET}XFPREGS
  read and write a process's i387_hard_fxsave structure.  These new
  requests are parallel to the existing PTRACE_{GET,SET}{,FP}REGS
  requests.

  I think these changes are backward-compatible in the right ways.  If
  the processor doesn't support SSE, or the kernel was configured
  without SSE support, then ptrace returns EIO, just as an older
  kernel would if presented with an unrecognized ptrace request.
  GDB adapts to old and new kernels just fine, and old GDB's should
  continue to work unchanged.

  I've submitted corresponding patches to GLIBC to add the structure
  and ptrace request to <sys/ptrace.h> and <sys/user.h>.

* The patch doesn't extend core dumps to include the SSE registers
  yet.  I assume that the best way to do this is to add a new note,
  with some new type name like "NT_PRXFPREG", containing an
  i387_hard_fxsave structure.  This would be in addition to an
  old-style NT_PRFPREG note, so all existing tools would continue to
  work.  Naturally, this note would only be dumped if the kernel and
  processor supported it.

* To save a process's context, these patches do both an fnstenv and
  an fxsave, even though fxsave saves all the context necessary.
  Thus, context switches are slower than they need to be.

  The rationale for the extra save, I think, is that several external
  interfaces --- signal contexts, core dumps, and ptrace --- use the
  old FP save format, but reconstructing an fstenv record from an
  fxsave record is non-trivial (although possible).  By saving the
  FP/SSE state in both formats, it's easy to retain compatibility with
  those old interfaces.

  I think Linux should simply use fxsave and fxrstor for context
  switches, and convert between that and the fnstenv format when
  needed.  The conversion is quick, but not as quick as a memcpy; see
  the description of the FXSAVE instruction for details.

  However, I don't think the cost is important.  These conversions
  would occur in three places:

  - When dumping core.  The cost of doing a single conversion is
    inconsequential next to the cost of writing the core file.

  - When a process makes a PTRACE_{GET,SET}FPREGS request.  New code
    can use PTRACE_{GET,SET}XFPREGS instead, so this is also not a
    problem.

  - When the kernel delivers a signal, or restores a signal context.
    A conversion is necessary if the kernel wants to continue to use
    the existing struct _fpstate.  However, I believe that struct
    _fpstate must be expanded to include the XMM registers as well.
    If we do this, we might as well change it to use the new format at
    the same time, so no conversion will be necessary.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/