
Date: Sat, 7 Aug 1999 17:41:46 -0600
From: Victor Yodaiken <yodaiken@rtlinux.com>
To: yodaiken@fsmlabs.com
Subject: [rtl] A design document for comment

As some of you may have noticed, we have a commercial venture that is now paying for most
of the core RTLinux development. Now I feel like I need to make more of an effort to 
open the design and development process. Kind of paradoxical, but what the hell. 


--------------------------- 

An explanation of the RTLinux V2 API and  some development notes
(c) Victor Yodaiken 1999. All rights reserved. 

THIS is a set of discussion points and possible plans and nothing
in here is a promise of any sort. Some of this is implemented, some
not.  If you have comments or, better yet, code or test results, I want to know.

With V2 and the Beta5+ betas, RTLinux is
moving towards compatibility with the POSIX RealTime
specification by following the POSIX "single process" model of a
realtime system. The other major changes are a cleanup of SMP
support and the addition of a framework for I/O.
The reasons for adopting POSIX-like APIs are that this is an existing and
not terrible API, many users asked for it, and rather than extending the
original API to deal with the complexities of multiple clocks, cross
cpu operation, I/O, and multiple processor architectures,
we thought it would be more reasonable to 
move to a more powerful API. HOWEVER!!, we were hesitant to adopt
POSIX because two  of the most important features of RTLinux are that it
is simple and fast. We were not willing to compromise these features for
POSIX compatibility, and the POSIX API is enormous and complicated.
Fortunately, POSIX has provided a "minimal" system specification that
is acceptable for our purposes and we have moved towards that standard.

We have made an effort to distinguish between GOOD POSIX and BAD POSIX.
GOOD POSIX is sensible and fits our core API well. BAD POSIX is slow, ugly, and 
an offense against all right thinking. GOOD POSIX will be supported natively.
BAD POSIX will be supported either by add on modules or by modules written
by someone else. The key is that users will not be forced to use BAD POSIX or to
pay a performance penalty for those users  who freely or otherwise
choose to use BAD POSIX.
We also have made an effort to make it relatively easy to emulate the RTL environment
from within a standard Pthreads environment so RTL code can be debugged in user mode
as much as possible.



Specifications in this note
are drawn from P1003.13 Draft 9 of the POSIX "Draft Standard for 
Information Technology -- Standardized Application Profile --
POSIX Realtime Applications Support (AEP)." (IEEE Std P1003.13-1999x)
and from the Single UNIX Specification available from http://www.opengroup.com.

Much of what is commonly understood to be the POSIX realtime standard is aimed towards
"soft realtime" operating systems and is not suitable for a hard realtime operating
system like RTLinux.
The POSIX Draft standard identifies, in section 6, a "Minimal Realtime System
Profile" (PSE51) intended for hard realtime systems like RTLinux. 
The "rationale" given is that  "the POSIX.1c Threads model (with all options enabled,
but without a file system) best reflected current industry practice in certain
embedded realtime areas. Instead of full file system support, basic device I/O (read,
write, open, close,  control) is considered sufficient for kernels of this size.
Systems of this size frequently do not include process isolation hardware or 
software; therefore, multiple processes (as opposed to threads) may not be
supported." (Page 36).

RTLinux supports POSIX Pthreads calls via the scheduler module. V2 RTLinux also
adds a new "timer" module. 
The timer module provides an interface to _hardware_ timers.
The default scheduler under Beta11 and beyond
provides a model of a single realtime process with multiple
threads, sharing a single address space. The process consists of the scheduler
and some number of realtime threads.  So each scheduler instance is a single
process. In a SMP environment, RTLinux will support multiple schedulers -- one
on each processor. These "processes" do share address space, but are logically
distinct. RTLinux does not provide any support for moving threads from one
"process" to another or for controlling threads in one process from another
process. Such functionality is easily obtained, if needed, by using the
primitives described below.

There is also an under-construction new module called POSIX_io that will
provide a standard POSIX interface for drivers. More on this below.

One of the guiding principles of our design is that POSIX compatibility cannot
be allowed to degrade performance and that POSIX components that are inherently
performance challenged must be optional. That is, if you must use clumsy POSIX
constructs, RTLinux will support you, but you are not required to do so.

The file RTL/include/rtl_sched.h defines the basic types and interfaces for this
model. 

Type: pthread_t is a pthread identifier.  The current implementation defines
pthread_t as a pointer to a thread structure, but this may change later. pthread_t
is intended to be opaque to RTL application programmers. The API is a mix of
POSIX mandated calls and "np" calls that simplify the system.

It is important to understand the concept of "mode" in RTLinux. At any time,
an RTLinux processor can be in "user", "kernel" or "rt" modes. User and kernel modes
are the standard Linux user/kernel modes and when the Linux task is running, the
processor must be in one of those modes. When an RT task or interrupt handler is
running, the processor is in RT mode. In an SMP system, processors change
modes independently. Most of the calls in the RTL API are "RT safe" and 
may be called while in RT mode, but certain calls must only be run in Linux
mode. These are generally calls that can introduce unbounded delays and/or need
Linux kernel resources.

Threads have no clocks. Clocks belong to schedulers.  We can add scheduling
policies in which a scheduler juggles multiple clocks, but there is no advantage
that I can see in allowing a thread to specify its hardware clock.
On the other hand, POSIX "clocks" are not necessarily hardware clocks.

-------------------------------------------------------------------------
-------------------------------------------------------------------------
1. Scheduler API
-------------------------------------------------------------------------

type: pthread_t
/* BASIC PTHREAD CALLS */
int pthread_create (pthread_t *p, pthread_attr_t *a, void *(*start_routine)(void *), void *arg)
void pthread_exit(void *retval);
void pthread_yield(void) NOT IMPLEMENTED
int pthread_setschedparam(pthread_t thread, int policy, const struct sched_param *param);
int pthread_getschedparam(pthread_t thread, int *policy, struct sched_param *param);
pthread_t pthread_self(void)

/* _np NON_POSIX calls */
int pthread_delete_np (pthread_t thread);  /*  obvious. The thread must be inactive or this fails */
int pthread_use_fp_np (pthread_t thread, int flag); /* allow this thread to use floating point */
---  POSIX does not provide a clean mechanism for suspending and waking up threads. Instead POSIX
---  wants to use the signal mechanisms for these. These are primitives in RTLinux but can be
---  implemented on top of signals if you want to run in user mode.

int pthread_wakeup_np (pthread_t thread);
int pthread_suspend_np (void);
int pthread_wait_np(void);
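
--- A minimal user-mode sketch of the suspend/wakeup pair built on signals, as the
--- note above suggests. It assumes a standard Pthreads environment, not the RTLinux
--- scheduler. The names emu_suspend_init, emu_suspend and emu_wakeup, and the choice
--- of SIGUSR1 as the wakeup signal, are hypothetical and not part of the RTL API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Assumption: SIGUSR1 is free to be used as the wakeup signal. */
#define EMU_WAKEUP_SIGNAL SIGUSR1

/* Build the one-signal set used by every function below. */
static void emu_wakeup_set(sigset_t *set)
{
    sigemptyset(set);
    sigaddset(set, EMU_WAKEUP_SIGNAL);
}

/* Block the wakeup signal in the calling thread. Call this before creating
 * other threads so that they inherit the blocked mask. */
int emu_suspend_init(void)
{
    sigset_t set;
    emu_wakeup_set(&set);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Suspend the calling thread until a wakeup arrives. A wakeup sent before
 * the thread suspends stays pending, so it is not lost. Returns 0 on success. */
int emu_suspend(void)
{
    sigset_t set;
    int sig;
    emu_wakeup_set(&set);
    return sigwait(&set, &sig);
}

/* Wake the given thread. Returns 0 on success, an error number otherwise. */
int emu_wakeup(pthread_t thread)
{
    return pthread_kill(thread, EMU_WAKEUP_SIGNAL);
}
```

--- Standard signals do not queue, so several wakeups sent to a thread that has
--- not yet suspended collapse into one. That is one reason to treat this only as
--- a debugging aid for user mode.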

-- POSIX does not provide a mechanism for adjusting periods. This reveals a great deal about
-- the non-rt basis of POSIX.

int pthread_setperiod_np(pthread_t p,const struct itimerspec *t);
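
--- For debugging in user mode, a periodic thread body can be approximated with
--- absolute-time sleeps. This is only a sketch under that assumption: it uses the
--- standard clock_nanosleep() call and a plain struct timespec period rather than
--- the itimerspec taken by pthread_setperiod_np. emu_run_periodic and
--- emu_count_job are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

#define EMU_NSEC_PER_SEC 1000000000L

/* Advance *t by one period. Assumes both tv_nsec fields are in [0, 1e9). */
static void emu_advance_release(struct timespec *t, const struct timespec *period)
{
    t->tv_sec += period->tv_sec;
    t->tv_nsec += period->tv_nsec;
    if (t->tv_nsec >= EMU_NSEC_PER_SEC) {
        t->tv_nsec -= EMU_NSEC_PER_SEC;
        t->tv_sec += 1;
    }
}

/* Call job(arg) once per period, count times. Release times are computed
 * from the start time, so a slow activation does not shift later ones.
 * Returns the number of activations, or -1 on a clock error. */
int emu_run_periodic(const struct timespec *period, int count,
                     void (*job)(void *), void *arg)
{
    struct timespec next;
    int done = 0;

    assert(period != NULL && job != NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    while (done < count) {
        int rc;
        emu_advance_release(&next, period);
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);   /* retry if a signal interrupted the sleep */
        if (rc != 0)
            return -1;
        job(arg);
        done++;
    }
    return done;
}

/* Example job: increments the int that arg points to. */
void emu_count_job(void *arg)
{
    ++*(int *)arg;
}
```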

--- NOTE. RTLinux will ***never*** implement the POSIX RR or FIFO scheduling policies as
---    fundamental policies. We will happily add schedulers that implement these policies, but
---    most RTLinux users don't want or need them. The same goes for other complex and
---    not useful POSIX components.

int sched_get_priority_max(int policy) 
int sched_get_priority_min(int policy) 

------ ATTRIBUTES---------
--- these are optional. If you don't use them, we make decent defaults 

 int pthread_attr_init(pthread_attr_t *attr)
 int pthread_attr_getstacksize(pthread_attr_t *attr, size_t * stacksize)
 int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize)
 int pthread_attr_getcpu(pthread_attr_t *attr, int * cpu)
 int pthread_attr_setschedparam(pthread_attr_t *attr, const struct sched_param *param)
 int pthread_attr_getschedparam(const pthread_attr_t *attr, struct sched_param *param)

--- this is a non-POSIX extension to support SMP. If you do not set this specifically,
--- the task is created, by default, on the current processor.
 int pthread_attr_setcpu_np(pthread_attr_t *attr, int cpu)
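
--- A short sketch of how the attribute calls fit together, written against a
--- standard user-mode Pthreads library (the SMP extension is left out because it
--- does not exist there). sk_spawn and sk_echo are hypothetical helper names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Start a thread with an explicit stack size. A stacksize of 0 keeps the
 * library default. Returns 0 on success, an error number otherwise. */
int sk_spawn(pthread_t *thread, size_t stacksize,
             void *(*start_routine)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc;

    assert(thread != NULL && start_routine != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    if (stacksize != 0)
        rc = pthread_attr_setstacksize(&attr, stacksize);
    if (rc == 0)
        rc = pthread_create(thread, &attr, start_routine, arg);

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return rc;
}

/* Trivial thread body for the usage example: hands its argument back. */
void *sk_echo(void *arg)
{
    return arg;
}
```

--- Usage in user mode: sk_spawn(&t, 1 << 20, sk_echo, &token) followed by
--- pthread_join(t, &out) leaves out pointing at token.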

/* only CLOCK_REALTIME is supported currently */
clock_gettime(clock, tsptr)
rtl_gettime(tsptr)
	-- these above should  be inline functions so we can give a real prototype

--- non-posix calls not supported on every architecture.
int rtl_clock_set_oneshot_mode (clockid_t clock);
int rtl_clock_set_periodic_mode (clockid_t clock, struct timespec *period);



NOT IMPLEMENTED.--------------------------------
#define clock_getres(clock, tsptr)
int timer_create(clockid_t clockid, void /*struct sigevent*/ *evp, timer_t *timerid);
int timer_delete(timer_t timerid);
int timer_settime(timer_t timerid, int flags,
		    const struct itimerspec *value, struct itimerspec *ovalue);
int timer_gettime(timer_t timerid, struct itimerspec *value);
int timer_getoverrun(timer_t timerid);

-----------------------------------------------------------------
-----------------------------------------------------------------
2. TIMER API 
-----------------------------------------------------------------

The timer exports a "hardware clock structure" as a low level interface
for schedulers and other modules below the pthreads layer. The timer also
defines operations on these clocks in nanoseconds.

struct hw_clock  rtl_best_clock(unsigned int ,RTL_CLOCK_HANDLER );
    This finds the "best" system clock for RT tasks and returns a
    token allowing operations on the clock.

rtl_clock_settimer(hw_clock_t, mode, tsptr) 
rtl_clock_getmode(hw_clock_t)
int rtl_clock_sethandler (hw_clock_t, RTL_CLOCK_HANDLER);
int rtl_gettime (struct timespec *);
int rtl_clock_unsethandler (hw_clock_t );
int rtl_clock_ishandlerset (hw_clock_t );
hw_clock_t rtl_clock_init(RTL_CLOCK_HANDLER );
void rtl_clock_uninit (clockid_t  );
typedef void (* RTL_CLOCK_HANDLER)(struct pt_regs *r);
--- this name "RTL_CLOCK_HANDLER" will change to a more consistent name


--- The file rtl_timer.h also defines a set of arithmetic operations on 
timespecs.

timespec_gt(t1, t2)
timespec_ge(t1, t2)
timespec_le(t1, t2)
timespec_eq(t1, t2) 
timespec_add(t1, t2)
long long timespec_to_ns (struct timespec *ts)
timespec_sub(t1, t2) 
timespec_nz(t) 
timespec_lt(t1, t2) 
struct timespec timespec_from_ns (long long t)
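
--- To make the intended arithmetic concrete, here is a sketch of a few of these
--- operations as plain functions. The sk_ names and the value-returning style are
--- assumptions for illustration; the operations in rtl_timer.h may be macros with
--- different calling conventions.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define SK_NSEC_PER_SEC 1000000000LL

/* Convert a timespec to a signed nanosecond count. */
long long sk_timespec_to_ns(const struct timespec *ts)
{
    assert(ts != NULL);
    return (long long)ts->tv_sec * SK_NSEC_PER_SEC + ts->tv_nsec;
}

/* Convert nanoseconds to a timespec with tv_nsec normalized to [0, 1e9). */
struct timespec sk_timespec_from_ns(long long ns)
{
    struct timespec ts;
    ts.tv_sec = (time_t)(ns / SK_NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % SK_NSEC_PER_SEC);
    if (ts.tv_nsec < 0) {          /* C division truncates toward zero */
        ts.tv_nsec += (long)SK_NSEC_PER_SEC;
        ts.tv_sec -= 1;
    }
    return ts;
}

/* Sum of two timespecs. */
struct timespec sk_timespec_add(const struct timespec *a, const struct timespec *b)
{
    return sk_timespec_from_ns(sk_timespec_to_ns(a) + sk_timespec_to_ns(b));
}

/* Difference a - b. */
struct timespec sk_timespec_sub(const struct timespec *a, const struct timespec *b)
{
    return sk_timespec_from_ns(sk_timespec_to_ns(a) - sk_timespec_to_ns(b));
}

/* Nonzero if a is strictly earlier than b. */
int sk_timespec_lt(const struct timespec *a, const struct timespec *b)
{
    return sk_timespec_to_ns(a) < sk_timespec_to_ns(b);
}
```

--- Going through a 64-bit nanosecond count keeps the carry handling in one place,
--- at the cost of overflow for intervals beyond roughly 292 years.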


The lowest level operations are via the hw_clock structure itself: 

typedef struct hw_clock * clockid_t;
struct hw_clock {
	int (*init) (struct hw_clock *);
	void (*uninit) (struct hw_clock *);
	int (*gettime) (struct timespec *ts);
	int (*settimer) (int mode, struct timespec *interval);
	RTL_CLOCK_HANDLER handler;
	int istimerset;
	int mode;
	int apic_cpu;
	struct timespec linux_time;
	struct timespec time;
	struct timespec resolution;
};

typedef struct rtl_timer_struct *timer_t;

/* modes */
#define CLOCK_MODE_UNINITIALIZED  -1
#define CLOCK_MODE_ONESHOT  0
#define CLOCK_MODE_PERIODIC 1
--- these will be replaced by enum types
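
--- The function-pointer layout can be exercised in user mode with a simulated
--- clock. This is a simplified sketch, not the hw_clock structure itself: the
--- sk_ names are hypothetical, the operations take the clock as an explicit
--- argument, the handler receives a void pointer instead of struct pt_regs, and
--- "time" only advances when sk_clock_fire() pretends the timer has expired.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

enum sk_clock_mode {
    SK_CLOCK_MODE_UNINITIALIZED = -1,
    SK_CLOCK_MODE_ONESHOT = 0,
    SK_CLOCK_MODE_PERIODIC = 1
};

typedef void (*sk_clock_handler)(void *context);

struct sk_clock {
    int (*gettime)(struct sk_clock *c, struct timespec *ts);
    int (*settimer)(struct sk_clock *c, int mode, const struct timespec *interval);
    sk_clock_handler handler;
    void *context;              /* passed to handler on every expiry */
    int istimerset;
    int mode;
    struct timespec time;       /* simulated current time */
    struct timespec interval;   /* programmed timer interval */
};

static int sim_gettime(struct sk_clock *c, struct timespec *ts)
{
    *ts = c->time;
    return 0;
}

static int sim_settimer(struct sk_clock *c, int mode, const struct timespec *interval)
{
    if (mode != SK_CLOCK_MODE_ONESHOT && mode != SK_CLOCK_MODE_PERIODIC)
        return -1;
    c->mode = mode;
    c->interval = *interval;
    c->istimerset = 1;
    return 0;
}

/* Fill in the operations table and reset the simulated state. */
void sk_clock_init(struct sk_clock *c, sk_clock_handler handler, void *context)
{
    assert(c != NULL);
    c->gettime = sim_gettime;
    c->settimer = sim_settimer;
    c->handler = handler;
    c->context = context;
    c->istimerset = 0;
    c->mode = SK_CLOCK_MODE_UNINITIALIZED;
    c->time.tv_sec = 0;
    c->time.tv_nsec = 0;
    c->interval = c->time;
}

/* Simulate one timer expiry: advance time by the interval and run the
 * handler. A one-shot timer disarms itself; a periodic one stays armed.
 * Returns -1 if no timer is set. */
int sk_clock_fire(struct sk_clock *c)
{
    assert(c != NULL);
    if (!c->istimerset)
        return -1;
    c->time.tv_sec += c->interval.tv_sec;
    c->time.tv_nsec += c->interval.tv_nsec;
    if (c->time.tv_nsec >= 1000000000L) {
        c->time.tv_nsec -= 1000000000L;
        c->time.tv_sec += 1;
    }
    if (c->mode == SK_CLOCK_MODE_ONESHOT)
        c->istimerset = 0;
    if (c->handler != NULL)
        c->handler(c->context);
    return 0;
}

/* Example handler: counts expiries in the int that context points to. */
void sk_count_handler(void *context)
{
    ++*(int *)context;
}
```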

---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
3. Commentary on the "rationale"
---------------------------------------------------------------------------------------------
More POSIX to be implemented. This part is a commentary on 6.3 "Rationale". The "rationale" is not
formally part of the specification.  My theory here is that there is an optional 
"posix_compat.h" file for people who really want all this junk.

---------------------------------------------
6.3.1.1 Process Primitives
"most POSIX.1 process primitives do not apply."
but
"Signal services are a basic mechanism with POSIX based systems and are required for error and 
 event handling"

In RTL we intend signals to be constructed from the rtl_mutex primitive and an unspecified mailbox implementation.
kill(p,s) == { put s in mailbox(p); if (rtl_is_waiting_mutex(p, mailbox(p)->mutex)) rtl_mutex_release(mailbox(p)->mutex); }
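
--- One way to picture the mailbox idea in user mode is below. It is a sketch
--- only: a Pthreads mutex and condition variable stand in for rtl_mutex, the
--- mailbox is a bitmask of pending signal numbers, and every sk_ name is
--- hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SK_NSIG 32   /* assumption: signal numbers 0..31 */

struct sk_mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    unsigned long   pending;   /* bit s set means signal s is pending */
};

#define SK_MAILBOX_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0UL }

/* Post signal sig to the mailbox and release a waiter, if any.
 * Returns 0 on success, -1 for an out-of-range signal number. */
int sk_kill(struct sk_mailbox *mb, int sig)
{
    assert(mb != NULL);
    if (sig < 0 || sig >= SK_NSIG)
        return -1;
    pthread_mutex_lock(&mb->lock);
    mb->pending |= 1UL << sig;
    pthread_cond_signal(&mb->nonempty);
    pthread_mutex_unlock(&mb->lock);
    return 0;
}

/* Block until a signal is pending, then remove and return the
 * lowest-numbered one. */
int sk_sigwait(struct sk_mailbox *mb)
{
    int sig = 0;

    assert(mb != NULL);
    pthread_mutex_lock(&mb->lock);
    while (mb->pending == 0)
        pthread_cond_wait(&mb->nonempty, &mb->lock);
    while ((mb->pending & (1UL << sig)) == 0)
        sig++;
    mb->pending &= ~(1UL << sig);
    pthread_mutex_unlock(&mb->lock);
    return sig;
}
```

--- The receiver pays for delivery only when it chooses to call sk_sigwait(),
--- which matches the goal of keeping signal cost out of the scheduler.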

---------------------------------------------
6.3.1.2 Process Environment

POSIX requires "sysconf" and "uname" to provide the following functionality:

long int sysconf(int name);
"Conforming applications must act as if CHILD_MAX == 0" -- which it will be in RTL.
I have no idea what most of these are.

ARG_MAX _SC_ARG_MAX
BC_BASE_MAX _SC_BC_BASE_MAX
BC_DIM_MAX _SC_BC_DIM_MAX
BC_SCALE_MAX _SC_BC_SCALE_MAX
BC_STRING_MAX _SC_BC_STRING_MAX
CHILD_MAX _SC_CHILD_MAX
CLK_TCK _SC_CLK_TCK
COLL_WEIGHTS_MAX _SC_COLL_WEIGHTS_MAX
EXPR_NEST_MAX _SC_EXPR_NEST_MAX
LINE_MAX _SC_LINE_MAX
NGROUPS_MAX _SC_NGROUPS_MAX
OPEN_MAX _SC_OPEN_MAX
PASS_MAX _SC_PASS_MAX (LEGACY)
_POSIX2_C_BIND _SC_2_C_BIND
_POSIX2_C_DEV _SC_2_C_DEV
_POSIX2_C_VERSION _SC_2_C_VERSION
_POSIX2_CHAR_TERM _SC_2_CHAR_TERM
_POSIX2_FORT_DEV _SC_2_FORT_DEV
_POSIX2_FORT_RUN _SC_2_FORT_RUN
_POSIX2_LOCALEDEF _SC_2_LOCALEDEF
_POSIX2_SW_DEV _SC_2_SW_DEV
_POSIX2_UPE _SC_2_UPE
_POSIX2_VERSION _SC_2_VERSION
_POSIX_JOB_CONTROL _SC_JOB_CONTROL
_POSIX_SAVED_IDS _SC_SAVED_IDS
_POSIX_VERSION _SC_VERSION
RE_DUP_MAX _SC_RE_DUP_MAX
STREAM_MAX _SC_STREAM_MAX
TZNAME_MAX _SC_TZNAME_MAX
_XOPEN_CRYPT _SC_XOPEN_CRYPT
_XOPEN_ENH_I18N _SC_XOPEN_ENH_I18N
_XOPEN_SHM _SC_XOPEN_SHM
_XOPEN_VERSION _SC_XOPEN_VERSION
_XOPEN_XCU_VERSION _SC_XOPEN_XCU_VERSION
_XOPEN_REALTIME _SC_XOPEN_REALTIME
_XOPEN_REALTIME_THREADS _SC_XOPEN_REALTIME_THREADS
_XOPEN_LEGACY _SC_XOPEN_LEGACY
ATEXIT_MAX _SC_ATEXIT_MAX
IOV_MAX _SC_IOV_MAX
PAGESIZE _SC_PAGESIZE
PAGE_SIZE _SC_PAGE_SIZE
_XOPEN_UNIX _SC_XOPEN_UNIX
_XBS5_ILP32_OFF32 _SC_XBS5_ILP32_OFF32
_XBS5_ILP32_OFFBIG _SC_XBS5_ILP32_OFFBIG
_XBS5_LP64_OFF64 _SC_XBS5_LP64_OFF64
_XBS5_LPBIG_OFFBIG _SC_XBS5_LPBIG_OFFBIG
AIO_LISTIO_MAX _SC_AIO_LISTIO_MAX
AIO_MAX _SC_AIO_MAX
AIO_PRIO_DELTA_MAX _SC_AIO_PRIO_DELTA_MAX
DELAYTIMER_MAX _SC_DELAYTIMER_MAX
MQ_OPEN_MAX _SC_MQ_OPEN_MAX
MQ_PRIO_MAX _SC_MQ_PRIO_MAX
RTSIG_MAX _SC_RTSIG_MAX
SEM_NSEMS_MAX _SC_SEM_NSEMS_MAX
SEM_VALUE_MAX _SC_SEM_VALUE_MAX
SIGQUEUE_MAX _SC_SIGQUEUE_MAX
TIMER_MAX _SC_TIMER_MAX
_POSIX_ASYNCHRONOUS_IO _SC_ASYNCHRONOUS_IO
_POSIX_FSYNC _SC_FSYNC
_POSIX_MAPPED_FILES _SC_MAPPED_FILES
_POSIX_MEMLOCK _SC_MEMLOCK
_POSIX_MEMLOCK_RANGE _SC_MEMLOCK_RANGE
_POSIX_MEMORY_PROTECTION _SC_MEMORY_PROTECTION
_POSIX_MESSAGE_PASSING _SC_MESSAGE_PASSING
_POSIX_PRIORITIZED_IO _SC_PRIORITIZED_IO
_POSIX_PRIORITY_SCHEDULING _SC_PRIORITY_SCHEDULING
_POSIX_REALTIME_SIGNALS _SC_REALTIME_SIGNALS
_POSIX_SEMAPHORES _SC_SEMAPHORES
_POSIX_SHARED_MEMORY_OBJECTS _SC_SHARED_MEMORY_OBJECTS
_POSIX_SYNCHRONIZED_IO _SC_SYNCHRONIZED_IO
_POSIX_TIMERS _SC_TIMERS
Maximum size of getgrgid_r() and getgrnam_r() data buffers _SC_GETGR_R_SIZE_MAX
Maximum size of getpwuid_r() and getpwnam_r() data buffers _SC_GETPW_R_SIZE_MAX
LOGIN_NAME_MAX _SC_LOGIN_NAME_MAX
PTHREAD_DESTRUCTOR_ITERATIONS _SC_THREAD_DESTRUCTOR_ITERATIONS
PTHREAD_KEYS_MAX _SC_THREAD_KEYS_MAX
PTHREAD_STACK_MIN _SC_THREAD_STACK_MIN
PTHREAD_THREADS_MAX _SC_THREAD_THREADS_MAX
TTY_NAME_MAX _SC_TTY_NAME_MAX
_POSIX_THREADS _SC_THREADS
_POSIX_THREAD_ATTR_STACKADDR _SC_THREAD_ATTR_STACKADDR
_POSIX_THREAD_ATTR_STACKSIZE _SC_THREAD_ATTR_STACKSIZE
_POSIX_THREAD_PRIORITY_SCHEDULING _SC_THREAD_PRIORITY_SCHEDULING
_POSIX_THREAD_PRIO_INHERIT _SC_THREAD_PRIO_INHERIT
_POSIX_THREAD_PRIO_PROTECT _SC_THREAD_PRIO_PROTECT
_POSIX_THREAD_PROCESS_SHARED _SC_THREAD_PROCESS_SHARED
_POSIX_THREAD_SAFE_FUNCTIONS _SC_THREAD_SAFE_FUNCTIONS


--- end POSIX quote
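
--- A minimal sketch of what a single-process sysconf could look like, assuming
--- that only a handful of names need real answers. sk_sysconf is a hypothetical
--- name. The only value taken from this note is CHILD_MAX == 0; reading the page
--- size from the host and rejecting everything else with EINVAL are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Answer the few configuration queries that make sense for a single
 * realtime process. Unknown names fail with errno set to EINVAL. */
long sk_sysconf(int name)
{
    switch (name) {
    case _SC_CHILD_MAX:
        return 0;                       /* no child processes, ever */
    case _SC_PAGESIZE:
        return sysconf(_SC_PAGESIZE);   /* simply read the host's data */
    default:
        errno = EINVAL;
        return -1;
    }
}
```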

-- Perhaps uname can simply read the Linux data. It seems harmless.

------------------------------------------------------------------
6.3.1.3 Files and directories

--- RTLinux will have a posixio module to give a basic framework for drivers.


"The open function is needed to do basic device I/O, also to provide device initialization."
"Although this requires some form of name resolution, a full pathname is specifically not
required. Directories are also not required."

open(s,mode) == if(s[1] != 0 ) return -EINVAL; return symbolic_name_to_fifo_number(s[0]);
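
--- Spelled out as C, the one-character name lookup above might look like the
--- sketch below. The mapping of the characters '0'..'9' to fifo numbers 0..9 is
--- an assumption made for the example, as are the sk_ names. The sketch also
--- rejects an empty name before looking at s[1].

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mapping: the digits '0'..'9' name fifos 0..9. */
static int sk_symbolic_name_to_fifo_number(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -ENOENT;
}

/* Resolve a single-character device name to a fifo number.
 * Returns the fifo number, or a negative errno value. */
int sk_open(const char *s, int mode)
{
    (void)mode;   /* unused in this sketch */
    if (s == NULL || s[0] == '\0' || s[1] != '\0')
        return -EINVAL;   /* only names of exactly one character */
    return sk_symbolic_name_to_fifo_number(s[0]);
}
```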
------------------------------------------------------------------
6.3.1.4 Input and output

"The functions _read_ _write_ , and _close_ are required to do basic I/O and device cleanup."

------------------------------------------------------------------
6.3.1.5 Device and class specific functions 

"POSIX.1 Device or Class-Specific functions are not required."

------------------------------------------------------------------
6.3.1.6  Language specific

C must be supported.

------------------------------------------------------------------
6.3.1.7  System databases

"Implementations are not required to support more than one user and group id ... No POSIX.1 system database
functions are required."

------------------------------------------------------------------
------------------------------------------------------------------
6.3.2 POSIX 1b requirements

TROUBLE!

------------------------------------------------------------------
6.3.2.1 Realtime signals


--- My grand plan for signals.
    1. hate em
    2. despise em
    3. implement them via brute force. There will be a posix_sig
       module that does the following on a kill
       A. lock everything
       B. if the target is ignoring, return
       C. go to the scheduler machine dependent saved state of the
          target and patch it to jump to handling code
       D. unsuspend  target
       E. done


Got a better idea? The price for this stupidity must be paid by
the task that asks to deliver/receive signals and can't 
further bloat our poor little scheduler.

Basically async operations are high overhead and not useful in a RT context. 





