Date: Sat, 7 Aug 1999 17:41:46 -0600
From: Victor Yodaiken <yodaiken@rtlinux.com>
To: yodaiken@fsmlabs.com
Subject: [rtl] A design document for comment

As some of you may have noticed, we have a commercial venture that is now paying for most of the core RTLinux development. Now I feel like I need to make more of an effort to open the design and development process. Kind of paradoxical, but what the hell.

---------------------------

An explanation of the RTLinux V2 API and some development notes
(c) Victor Yodaiken 1999. All rights reserved.

THIS is a set of discussion points and possible plans and nothing in here is a promise of any sort. Some of this is implemented, some not. If you have comments or, better yet, code or test results, I want to know.

With V2 and the Beta5+ betas, RTLinux is moving towards compatibility with the POSIX RealTime specification by following the POSIX "single process" model of a realtime system. The other major changes are a cleanup of SMP support and the addition of a framework for I/O.

The reasons for adopting POSIX-like APIs are that this is an existing and not terrible API, that many users asked for it, and that, rather than extending the original API to deal with the complexities of multiple clocks, cross-CPU operation, I/O, and multiple processor architectures, we thought it would be more reasonable to move to a more powerful API.

HOWEVER!!, we were hesitant to adopt POSIX because two of the most important features of RTLinux are that it is simple and fast. We were not willing to compromise these features for POSIX compatibility, and the POSIX API is enormous and complicated. Fortunately, POSIX has provided a "minimal" system specification that is acceptable for our purposes and we have moved towards that standard.

We have made an effort to distinguish between GOOD POSIX and BAD POSIX. GOOD POSIX is sensible and fits our core API well. BAD POSIX is slow, ugly, and an offense against all right thinking. GOOD POSIX will be supported natively.
BAD POSIX will be supported either by add-on modules or by modules written by someone else. The key is that users will not be forced to use BAD POSIX, or to pay a performance penalty on behalf of those users who, freely or otherwise, choose to use BAD POSIX.

We have also made an effort to make it relatively easy to emulate the RTL environment from within a standard Pthreads environment so RTL code can be debugged in user mode as much as possible.

Specifications in this note are drawn from P1003.13 Draft 9 of the POSIX "Draft Standard for Information Technology -- Standardized Application Profile -- POSIX Realtime Applications Support (AEP)." (IEEE Std P1003.13-1999x) and from the Single UNIX Specification available from http://www.opengroup.com.

Much of what is commonly understood to be the POSIX realtime standard is aimed towards "soft realtime" operating systems and is not suitable for a hard realtime operating system like RTLinux. The POSIX Draft standard identifies, in section 6, a "Minimal Realtime System Profile" (PSE51) intended for hard realtime systems like RTLinux. The "rationale" given is that

  "the POSIX.1c Threads model (with all options enabled, but without a file system) best reflected current industry practice in certain embedded realtime areas. Instead of full file system support, basic device I/O (read, write, open, close, control) is considered sufficient for kernels of this size. Systems of this size frequently do not include process isolation hardware or software; therefore, multiple processes (as opposed to threads) may not be supported." (Page 36).

RTLinux supports POSIX Pthreads calls via the scheduler module. V2 RTLinux also adds a new "timer" module. The timer module provides an interface to _hardware_ timers.

The default scheduler under Beta11 and beyond provides a model of a single realtime process with multiple threads, sharing a single address space. The process consists of the scheduler and some number of realtime threads.
So each scheduler instance is a single process. In an SMP environment, RTLinux will support multiple schedulers -- one on each processor. These "processes" do share address space, but are logically distinct. RTLinux does not provide any support for moving threads from one "process" to another or for controlling threads in one process from another process. Such functionality is easily obtained, if needed, by using the primitives described below.

There is also a new module under construction, called POSIX_io, that will provide a standard POSIX interface for drivers. More on this below.

One of the guiding principles of our design is that POSIX compatibility cannot be allowed to degrade performance and that POSIX components that are inherently performance challenged must be optional. That is, if you must use clumsy POSIX constructs, RTLinux will support you, but you are not required to do so.

The file RTL/include/rtl_sched.h defines the basic types and interfaces for this model.

Type: pthread_t is a pthread identifier. The current implementation defines pthread_t as a pointer to a thread structure, but this may change later. pthread_t is intended to be opaque to RTL application programmers.

The API is a mix of POSIX mandated calls and "np" calls that simplify the system.

It is important to understand the concept of "mode" in RTLinux. At any time, an RTLinux processor can be in "user", "kernel", or "rt" mode. User and kernel modes are the standard Linux user/kernel modes and when the Linux task is running, the processor must be in one of those modes. When an RT task or interrupt handler is running, the processor is in RT mode. In an SMP system, processors change modes independently. Most of the calls in the RTL API are "RT safe" and may be called while in RT mode, but certain calls must only be run in Linux mode. These are generally calls that can introduce unbounded delays and/or need Linux kernel resources.

Threads have no clocks. Clocks belong to schedulers.
We can add scheduling policies in which a scheduler juggles multiple clocks, but there is no advantage that I can see in allowing a thread to specify its hardware clock. On the other hand, POSIX "clocks" are not necessarily hardware clocks.

-------------------------------------------------------------------------
-------------------------------------------------------------------------
1. Scheduler API
-------------------------------------------------------------------------
type: pthread_t

/* BASIC PTHREAD CALLS */
int pthread_create (pthread_t *p, pthread_attr_t *a, void *(*start_routine)(void *), void *arg)
void pthread_exit(void *retval);
void pthread_yield(void) NOT IMPLEMENTED
int pthread_setschedparam(pthread_t thread, int policy,
int pthread_getschedparam(pthread_t thread, int *policy,
pthread_t pthread_self(void)

/* _np NON_POSIX calls */
int pthread_delete_np (pthread_t thread);
        /* obvious. The thread must be inactive or this fails */
int pthread_use_fp_np (pthread_t thread, int flag);
        /* allow this thread to use floating point */

--- POSIX does not provide a clean mechanism for suspending and waking up threads. Instead POSIX
--- wants to use the signal mechanisms for these. These are primitives in RTLinux but can be
--- implemented on top of signals if you want to run in user mode.
int pthread_wakeup_np (pthread_t thread);
int pthread_suspend_np (void);
int pthread_wait_np(void);

-- POSIX does not provide a mechanism for adjusting periods. This reveals a great deal about
-- the non-rt basis of POSIX.
int pthread_setperiod_np(pthread_t p,const struct itimerspec *t);

--- NOTE. RTLinux will ***never*** implement the POSIX RR or FIFO scheduling policies as
--- fundamental policies. We will happily add schedulers that implement these policies, but
--- most RTLinux users don't want or need them. The same goes for other complex and
--- not useful POSIX components.
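A minimal user-mode sketch of the suspend/wakeup pair listed above, for debugging outside the kernel. It swaps in a plain pthreads mutex and condition variable rather than signals. The names emu_thread, emu_suspend, emu_wakeup and emu_demo are hypothetical and are not part of the RTLinux API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-mode stand-in for one RTL thread's suspend state. */
struct emu_thread {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int             pending;     /* wakeups delivered but not yet consumed */
    int             iterations;  /* work done by the thread, for the demo */
};

/* Rough analogue of pthread_suspend_np(): block until a wakeup arrives. */
void emu_suspend(struct emu_thread *t)
{
    pthread_mutex_lock(&t->lock);
    while (t->pending == 0)              /* loop guards against spurious wakeups */
        pthread_cond_wait(&t->wake, &t->lock);
    assert(t->pending > 0);
    t->pending--;
    pthread_mutex_unlock(&t->lock);
}

/* Rough analogue of pthread_wakeup_np(): record a wakeup and signal it. */
void emu_wakeup(struct emu_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->pending++;
    pthread_cond_signal(&t->wake);
    pthread_mutex_unlock(&t->lock);
}

/* Thread body: suspend, do one unit of "work", repeat three times. */
void *emu_worker(void *arg)
{
    struct emu_thread *t = arg;
    for (int i = 0; i < 3; i++) {
        emu_suspend(t);
        t->iterations++;
    }
    return NULL;
}

/* Drives the worker through three cycles; returns the iteration count. */
int emu_demo(void)
{
    struct emu_thread t;
    pthread_t tid;

    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.wake, NULL);
    t.pending = 0;
    t.iterations = 0;

    if (pthread_create(&tid, NULL, emu_worker, &t) != 0)
        return -1;
    for (int i = 0; i < 3; i++)
        emu_wakeup(&t);
    pthread_join(tid, NULL);

    pthread_cond_destroy(&t.wake);
    pthread_mutex_destroy(&t.lock);
    return t.iterations;
}
```

Calling emu_demo() runs the worker through three suspend/wakeup cycles. This sketch counts wakeups so that one sent before the target suspends is not lost; whether the RTLinux primitives behave that way is not stated in this note.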
int sched_get_priority_max(int policy)
int sched_get_priority_min(int policy)

------ ATTRIBUTES ---------
--- these are optional. If you don't use them, we make decent defaults
int pthread_attr_init(pthread_attr_t *attr)
int pthread_attr_getstacksize(pthread_attr_t *attr, size_t * stacksize)
int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize)
int pthread_attr_getcpu(pthread_attr_t *attr, int * cpu)
int pthread_attr_setschedparam(pthread_attr_t *attr,
int pthread_attr_getschedparam(const pthread_attr_t *attr,

--- this is a non-POSIX extension to support SMP. If you do not set this specifically,
--- the task is created, by default, on the current processor.
int pthread_attr_setcpu_np(pthread_attr_t *attr, int cpu)

/* only CLOCK_REALTIME is supported currently */
clock_gettime(clock, tsptr)
rtl_gettime(tsptr)
-- these above should be inline functions so we can give a real prototype

--- non-posix calls not supported in every architecture.
int rtl_clock_set_oneshot_mode (clockid_t clock);
int rtl_clock_set_periodic_mode (clockid_t clock, struct timespec *period);

NOT IMPLEMENTED. --------------------------------
#define clock_getres (clock, tsptr)
int timer_create(clockid_t clockid, void /*struct sigevent*/ *evp, timer_t *timerid);
int timer_delete(timer_t timerid);
int timer_settime(timer_t timerid, int flags, const struct itimerspec *value, struct itimerspec *ovalue);
int timer_gettime(timer_t timerid, struct itimerspec *value);
int timer_getoverrun(timer_t timerid);

-----------------------------------------------------------------
-----------------------------------------------------------------
2. TIMER API
-----------------------------------------------------------------
The timer exports a "hardware clock structure" as a low level interface for schedulers and other modules that sit below pthreads. The timer also defines operations on these clocks in nanoseconds.
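To make the nanosecond operations concrete, here is a hedged user-mode sketch of the kind of conversion and arithmetic helpers that rtl_timer.h is described as providing. The ts_* names are placeholders chosen for this example, non-negative normalized values are assumed, and the real RTLinux macros may well differ.

```c
#include <assert.h>
#include <time.h>

#define TS_NSEC_PER_SEC 1000000000LL

/* Assumed meaning of a "to nanoseconds" conversion: total ns in the timespec. */
long long ts_to_ns(const struct timespec *ts)
{
    return (long long)ts->tv_sec * TS_NSEC_PER_SEC + ts->tv_nsec;
}

/* Inverse conversion; only non-negative inputs are handled in this sketch. */
struct timespec ts_from_ns(long long ns)
{
    struct timespec ts;
    assert(ns >= 0);
    ts.tv_sec  = (time_t)(ns / TS_NSEC_PER_SEC);
    ts.tv_nsec = (long)(ns % TS_NSEC_PER_SEC);
    return ts;
}

/* Sum of two normalized timespecs, carrying nanoseconds into seconds. */
struct timespec ts_add(struct timespec a, struct timespec b)
{
    struct timespec r;
    r.tv_sec  = a.tv_sec + b.tv_sec;
    r.tv_nsec = a.tv_nsec + b.tv_nsec;
    if (r.tv_nsec >= TS_NSEC_PER_SEC) {
        r.tv_sec++;
        r.tv_nsec -= TS_NSEC_PER_SEC;
    }
    return r;
}

/* Strict "less than" comparison: seconds first, then nanoseconds. */
int ts_lt(struct timespec a, struct timespec b)
{
    if (a.tv_sec != b.tv_sec)
        return a.tv_sec < b.tv_sec;
    return a.tv_nsec < b.tv_nsec;
}
```

For example, ts_add(ts_from_ns(1500000000LL), ts_from_ns(700000000LL)) yields 2 seconds and 200000000 nanoseconds. Keeping tv_nsec below one second after every operation is what lets the comparison look at seconds first.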
struct hw_clock rtl_best_clock(unsigned int ,RTL_CLOCK_HANDLER );
This finds the "best" system clock for RT tasks and returns a token allowing operations on the clock.

rtl_clock_settimer(hw_clock_t, mode, tsptr)
rtl_clock_getmode(hw_clock_t)
int rtl_clock_sethandler (hw_clock_t, RTL_CLOCK_HANDLER);
int rtl_gettime (struct timespec *);
int rtl_clock_unsethandler (hw_clock_t );
int rtl_clock_ishandlerset (hw_clock_t );
hw_clock_t rtl_clock_init(RTL_CLOCK_HANDLER );
void rtl_clock_uninit (clockid_t );

typedef void (* RTL_CLOCK_HANDLER)(struct pt_regs *r);
--- this name "RTL_CLOCK_HANDLER" will change to a more consistent name ---

The file rtl_timer.h also defines a set of arithmetic operations on timespecs.

timespec_gt(t1, t2)
timespec_ge(t1, t2)
timespec_le(t1, t2)
timespec_eq(t1, t2)
timespec_add(t1, t2)
long long timespec_to_ns (struct timespec *ts)
timespec_sub(t1, t2)
timespec_nz(t)
timespec_lt(t1, t2)
struct timespec timespec_from_ns (long long t)

The lowest level operations are via the hw_clock structure itself:

typedef struct hw_clock * clockid_t;

struct hw_clock {
        int (*init) (struct hw_clock *);
        void (*uninit) (struct hw_clock *);
        int (*gettime) (struct timespec *ts);
        int (*settimer) (int mode, struct timespec *interval);
        RTL_CLOCK_HANDLER handler;
        int istimerset;
        int mode;
        int apic_cpu;
        struct timespec linux_time;
        struct timespec time;
        struct timespec resolution;
};

typedef struct rtl_timer_struct *timer_t;

/* modes */
#define CLOCK_MODE_UNINITIALIZED -1
#define CLOCK_MODE_ONESHOT 0
#define CLOCK_MODE_PERIODIC 1
--- these will be replaced by enum types

---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
3. Commentary on the "rationale"
---------------------------------------------------------------------------------------------
More POSIX to be implemented. This part is a commentary on 6.3 "Rationale".
The "rationale" is not formally part of the specification. My theory here is that there is an optional "posix_compat.h" file for people who really want all this junk.

---------------------------------------------
6.3.1.1 Process Primitives

"most POSIX.1 process primitives do not apply." but "Signal services are a basic mechanism with POSIX based systems and are required for error and event handling"

In RTL we intend signals to be constructed from the rtl_mutex primitive and an unspecified mailbox implementation.

kill(p,s) == {
        put s in mailbox(p);
        if(rtl_is_waiting_mutex(p,mailbox(p)->mutex))
                rtl_mutex_release(mailbox(p)->mutex)
}

---------------------------------------------
6.3.1.2 Process Environment

Posix requires "sysconf" and "uname" to provide the following functionality

long int sysconf(int name);

"Conforming applications must act as if CHILD_MAX == 0" -- which it will be in RTL. I have no idea what most of these are.

Variable                            Value of name
ARG_MAX                             _SC_ARG_MAX
BC_BASE_MAX                         _SC_BC_BASE_MAX
BC_DIM_MAX                          _SC_BC_DIM_MAX
BC_SCALE_MAX                        _SC_BC_SCALE_MAX
BC_STRING_MAX                       _SC_BC_STRING_MAX
CHILD_MAX                           _SC_CHILD_MAX
CLK_TCK                             _SC_CLK_TCK
COLL_WEIGHTS_MAX                    _SC_COLL_WEIGHTS_MAX
EXPR_NEST_MAX                       _SC_EXPR_NEST_MAX
LINE_MAX                            _SC_LINE_MAX
NGROUPS_MAX                         _SC_NGROUPS_MAX
OPEN_MAX                            _SC_OPEN_MAX
PASS_MAX                            _SC_PASS_MAX (LEGACY)
_POSIX2_C_BIND                      _SC_2_C_BIND
_POSIX2_C_DEV                       _SC_2_C_DEV
_POSIX2_C_VERSION                   _SC_2_C_VERSION
_POSIX2_CHAR_TERM                   _SC_2_CHAR_TERM
_POSIX2_FORT_DEV                    _SC_2_FORT_DEV
_POSIX2_FORT_RUN                    _SC_2_FORT_RUN
_POSIX2_LOCALEDEF                   _SC_2_LOCALEDEF
_POSIX2_SW_DEV                      _SC_2_SW_DEV
_POSIX2_UPE                         _SC_2_UPE
_POSIX2_VERSION                     _SC_2_VERSION
_POSIX_JOB_CONTROL                  _SC_JOB_CONTROL
_POSIX_SAVED_IDS                    _SC_SAVED_IDS
_POSIX_VERSION                      _SC_VERSION
RE_DUP_MAX                          _SC_RE_DUP_MAX
STREAM_MAX                          _SC_STREAM_MAX
TZNAME_MAX                          _SC_TZNAME_MAX
_XOPEN_CRYPT                        _SC_XOPEN_CRYPT
_XOPEN_ENH_I18N                     _SC_XOPEN_ENH_I18N
_XOPEN_SHM                          _SC_XOPEN_SHM
_XOPEN_VERSION                      _SC_XOPEN_VERSION
_XOPEN_XCU_VERSION                  _SC_XOPEN_XCU_VERSION
_XOPEN_REALTIME                     _SC_XOPEN_REALTIME
_XOPEN_REALTIME_THREADS             _SC_XOPEN_REALTIME_THREADS
_XOPEN_LEGACY                       _SC_XOPEN_LEGACY
ATEXIT_MAX                          _SC_ATEXIT_MAX
IOV_MAX                             _SC_IOV_MAX
PAGESIZE                            _SC_PAGESIZE
PAGE_SIZE                           _SC_PAGE_SIZE
_XOPEN_UNIX                         _SC_XOPEN_UNIX
_XBS5_ILP32_OFF32                   _SC_XBS5_ILP32_OFF32
_XBS5_ILP32_OFFBIG                  _SC_XBS5_ILP32_OFFBIG
_XBS5_LP64_OFF64                    _SC_XBS5_LP64_OFF64
_XBS5_LPBIG_OFFBIG                  _SC_XBS5_LPBIG_OFFBIG
AIO_LISTIO_MAX                      _SC_AIO_LISTIO_MAX
AIO_MAX                             _SC_AIO_MAX
AIO_PRIO_DELTA_MAX                  _SC_AIO_PRIO_DELTA_MAX
DELAYTIMER_MAX                      _SC_DELAYTIMER_MAX
MQ_OPEN_MAX                         _SC_MQ_OPEN_MAX
MQ_PRIO_MAX                         _SC_MQ_PRIO_MAX
RTSIG_MAX                           _SC_RTSIG_MAX
SEM_NSEMS_MAX                       _SC_SEM_NSEMS_MAX
SEM_VALUE_MAX                       _SC_SEM_VALUE_MAX
SIGQUEUE_MAX                        _SC_SIGQUEUE_MAX
TIMER_MAX                           _SC_TIMER_MAX
_POSIX_ASYNCHRONOUS_IO              _SC_ASYNCHRONOUS_IO
_POSIX_FSYNC                        _SC_FSYNC
_POSIX_MAPPED_FILES                 _SC_MAPPED_FILES
_POSIX_MEMLOCK                      _SC_MEMLOCK
_POSIX_MEMLOCK_RANGE                _SC_MEMLOCK_RANGE
_POSIX_MEMORY_PROTECTION            _SC_MEMORY_PROTECTION
_POSIX_MESSAGE_PASSING              _SC_MESSAGE_PASSING
_POSIX_PRIORITIZED_IO               _SC_PRIORITIZED_IO
_POSIX_PRIORITY_SCHEDULING          _SC_PRIORITY_SCHEDULING
_POSIX_REALTIME_SIGNALS             _SC_REALTIME_SIGNALS
_POSIX_SEMAPHORES                   _SC_SEMAPHORES
_POSIX_SHARED_MEMORY_OBJECTS        _SC_SHARED_MEMORY_OBJECTS
_POSIX_SYNCHRONIZED_IO              _SC_SYNCHRONIZED_IO
_POSIX_TIMERS                       _SC_TIMERS
Maximum size of getgrgid_r() and getgrnam_r() data buffers    _SC_GETGR_R_SIZE_MAX
Maximum size of getpwuid_r() and getpwnam_r() data buffers    _SC_GETPW_R_SIZE_MAX
LOGIN_NAME_MAX                      _SC_LOGIN_NAME_MAX
PTHREAD_DESTRUCTOR_ITERATIONS       _SC_THREAD_DESTRUCTOR_ITERATIONS
PTHREAD_KEYS_MAX                    _SC_THREAD_KEYS_MAX
PTHREAD_STACK_MIN                   _SC_THREAD_STACK_MIN
PTHREAD_THREADS_MAX                 _SC_THREAD_THREADS_MAX
TTY_NAME_MAX                        _SC_TTY_NAME_MAX
_POSIX_THREADS                      _SC_THREADS
_POSIX_THREAD_ATTR_STACKADDR        _SC_THREAD_ATTR_STACKADDR
_POSIX_THREAD_ATTR_STACKSIZE        _SC_THREAD_ATTR_STACKSIZE
_POSIX_THREAD_PRIORITY_SCHEDULING   _SC_THREAD_PRIORITY_SCHEDULING
_POSIX_THREAD_PRIO_INHERIT          _SC_THREAD_PRIO_INHERIT
_POSIX_THREAD_PRIO_PROTECT          _SC_THREAD_PRIO_PROTECT
_POSIX_THREAD_PROCESS_SHARED        _SC_THREAD_PROCESS_SHARED
_POSIX_THREAD_SAFE_FUNCTIONS        _SC_THREAD_SAFE_FUNCTIONS
--- end POSIX quote --

Perhaps uname can simply read the Linux data. It seems harmless.

------------------------------------------------------------------
6.3.1.3 Files and directories
--- RTLinux will have a posixio module to give a basic framework for drivers.

"The open function is needed to do basic device I/O, also to provide device initialization."
"Although this requires some form of name resolution, a full pathname is specifically not required. Directories are also not required."

open(s,mode) ==
        if(s[1] != 0 ) return -EINVAL;
        return symbolic_name_to_fifo_number(s[0]);

------------------------------------------------------------------
6.3.1.4 Input and output

"The functions _read_, _write_, and _close_ are required to do basic I/O and device cleanup."

------------------------------------------------------------------
6.3.1.5 Device and class specific functions

"POSIX.1 Device and Class-Specific functions are not required."

------------------------------------------------------------------
6.3.1.6 Language specific

C must be supported.

------------------------------------------------------------------
6.3.1.7 System databases

"Implementations are not required to support more than one user and group id ... No POSIX.1 system database functions are required."

------------------------------------------------------------------
------------------------------------------------------------------
6.3.2 POSIX 1b requirements

TROUBLE!

------------------------------------------------------------------
6.3.2.1 Realtime signals
--- My grand plan for signals.
   1. hate em
   2. despise em
   3. implement them via brute force.

There will be a posix_sig module that does the following on a kill
   A. lock everything
   B. if the target is ignoring, return
   C. go to the scheduler machine dependent saved state of the target and patch it to jump to handling code
   D. unsuspend target
   E. done

Got a better idea?
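The brute-force delivery steps in the plan above can be modelled in a toy, user-mode form. This is only a sketch under loud assumptions: toy_thread, toy_kill, toy_trampoline and toy_dispatch are invented for illustration, a function pointer stands in for the machine dependent saved state, and a counter stands in for "lock everything". Nothing here is the posix_sig module.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NSIG 32

struct toy_thread;
typedef void (*toy_pc_t)(struct toy_thread *);
typedef void (*toy_handler_t)(int sig);

/* Hypothetical model of the state a scheduler keeps for one thread. */
struct toy_thread {
    toy_pc_t      pc;                 /* saved resume point */
    toy_pc_t      interrupted_pc;     /* resume point displaced by a signal */
    toy_handler_t handler[TOY_NSIG];  /* NULL means the signal is ignored */
    int           pending;            /* signal being delivered, or -1 */
    int           suspended;
};

int toy_lock_depth;                   /* stand-in for "lock everything" */
void toy_lock_all(void)   { toy_lock_depth++; }
void toy_unlock_all(void) { toy_lock_depth--; }

/* Handling code the target is patched to run: restore, then call handler. */
void toy_trampoline(struct toy_thread *t)
{
    int sig = t->pending;
    assert(sig >= 0 && sig < TOY_NSIG);
    t->pending = -1;
    t->pc = t->interrupted_pc;        /* next dispatch resumes the old code */
    t->handler[sig](sig);
}

/* Returns 1 if delivery was arranged, 0 if dropped, -1 on a bad signal. */
int toy_kill(struct toy_thread *t, int sig)
{
    if (sig < 0 || sig >= TOY_NSIG)
        return -1;
    toy_lock_all();                               /* lock everything */
    if (t->handler[sig] == NULL || t->pending != -1) {
        toy_unlock_all();                         /* ignored, or one already in flight */
        return 0;
    }
    t->interrupted_pc = t->pc;                    /* patch the saved state ... */
    t->pc = toy_trampoline;                       /* ... to jump to handling code */
    t->pending = sig;
    t->suspended = 0;                             /* unsuspend target */
    toy_unlock_all();                             /* done */
    return 1;
}

/* Toy scheduler step: run the thread from its saved resume point. */
void toy_dispatch(struct toy_thread *t)
{
    if (!t->suspended && t->pc != NULL)
        t->pc(t);
}

/* Demo pieces: a handler that records the signal, and an idle thread body. */
int toy_last_sig;
void toy_record(int sig)             { toy_last_sig = sig; }
void toy_body(struct toy_thread *t)  { t->suspended = 1; }
```

In this model, toy_kill on a suspended thread with a registered handler makes the next toy_dispatch run the handler first, after which the thread resumes at its original point. The sketch drops a second signal while one is in flight, which is a simplification rather than a design decision from this note.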
The price for this stupidity must be paid by the task that asks to deliver/receive signals and can't further bloat our poor little scheduler. Basically async operations are high overhead and not useful in a RT context.