Migrating to QNX Neutrino for ARMv6-Processor-Based Boards

Overview

This technote describes how to set up QNX Neutrino on boards that use an ARMv6 processor.

You must use procnto-v6, the QNX Neutrino microkernel for ARMv6 processors.

Support for ARMv6 architecture processors (ARM11, OMAP2) is now provided by:

The procnto-v6 microkernel takes advantage of the ARMv6 MMU's physically-tagged cache to remove the 32 MB address space restriction imposed by the previous ARM MMU architecture. Note the following:

The default configuration for qcc still causes the compiler to generate ARMv4 instructions. This ensures that all code compiled with this configuration can run on any supported ARM processor.

Additional compiler flags are required to instruct the compiler to generate ARMv6 instructions. Using such code on non-ARMv6 processors may cause undefined instruction exceptions (generating a SIGILL signal).

BSP configuration

The libstartup CPU detection and configuration process includes the following changes since the 6.3.0 release:

armv_chip

Purpose: The armv_chip structure describes the configuration for a particular CPU.

cpuid
Contains bits 15:0 of the CP15 main ID register.

The armv_list[] array defined in armv_list.c contains a list of all supported CPUs, and the arm_chip_detect() function iterates through this array to match bits 15:0 of the ID register.

A BSP can override the library's armv_list.c to provide a customized list of supported CPUs, for example to specify armv_chip structures that aren't implemented in libstartup, or to restrict the list to the processor(s) implemented by the target board.

name
The textual name of the processor.
mmu_cr_set
Specifies which bits to set in the CP15 MMU control register when the MMU is enabled in vstart().
mmu_cr_clr
Specifies which bits to clear in the CP15 MMU control register when the MMU is enabled in vstart().
cycles
The number of CPU cycles taken by the arm_cpuspeed.c calibration loop.
cache
A pointer to an armv_cache structure describing the cache configuration.
power
A pointer to the CPU-specific power callout.

If no power callout is specified, the kernel's idle loop simply busy-loops, and the sysmgr_cpumode() call fails with ENOSYS.

flush and deferred
Pointers to the CPU-specific callouts used by procnto to handle unmapping pages.

The flush callout is used to flush the cache and TLB when unmapping a page. This is called for each page in a region being unmapped.

The deferred callout is used after all pages in a region have been unmapped, and can be used to perform any actions that were not performed by the flush callout.

For example, if the MMU doesn't support flushing the instruction cache by virtual address, the deferred callout can be used to flush the instruction cache after all pages have been unmapped to reduce the cost of flushing.

pte
A pointer to the default page table configuration.
pte_wa
A pointer to the page table configuration for write-allocate cache behavior.

If you specify the -wa option, the pte_wa configuration is used. If the CPU doesn't support write-allocate caching, set this to 0, and the default pte values will be used instead.

pte_wb
A pointer to the page table configuration for write-back cache behavior.

If you specify the -wb option, the pte_wb configuration is used. If the CPU doesn't support write-back caching, set this to 0, and the default pte values will be used instead.

pte_wt
A pointer to the page table configuration for write-through cache behavior.

If you specify the -wt option, the pte_wt configuration is used. If the CPU doesn't support write-through caching, set this to 0, and the default pte values will be used instead.

setup
A pointer to a function that performs additional CPU-specific initialization.
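
The lookup and the page table fallback described above can be sketched in a few lines of C. This is a minimal illustration, not the libstartup code: struct chip_sketch is a hypothetical, trimmed-down stand-in for armv_chip, the list is assumed to be an array of pointers terminated by a null entry, and the cpuid values are made up for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for armv_chip: only the fields used in this sketch. */
struct chip_sketch {
    unsigned    cpuid;   /* bits 15:0 of the CP15 main ID register */
    const char *name;    /* textual name of the processor */
    const void *pte;     /* default page table configuration */
    const void *pte_wa;  /* write-allocate configuration, 0 if unsupported */
};

/* Dummy objects so the pointers below refer to something distinct. */
static const int pte_default_a, pte_default_b, pte_wa_a;

/* Made-up cpuid values, chosen only for illustration. */
static const struct chip_sketch chip_a = { 0x1111u, "chip-a", &pte_default_a, &pte_wa_a };
static const struct chip_sketch chip_b = { 0x2222u, "chip-b", &pte_default_b, 0 };

/* Assumption: the list ends with a null pointer. */
const struct chip_sketch *const sketch_list[] = { &chip_a, &chip_b, NULL };

/* Return the first entry whose cpuid equals bits 15:0 of main_id, or NULL. */
const struct chip_sketch *
detect_chip(const struct chip_sketch *const *list, unsigned main_id)
{
    unsigned low = main_id & 0xffffu;

    for (; *list != NULL; list++) {
        if ((*list)->cpuid == low) {
            return *list;
        }
    }
    return NULL;
}

/* Pick the write-allocate configuration when requested and available;
 * otherwise fall back to the default pte configuration. */
const void *
select_pte(const struct chip_sketch *chip, int want_write_allocate)
{
    if (want_write_allocate && chip->pte_wa != NULL) {
        return chip->pte_wa;
    }
    return chip->pte;
}
```

A BSP-specific list would simply be a different array passed to the same lookup. The same fallback pattern would apply to the pte_wb and pte_wt fields.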

armv_cache

Purpose: The armv_cache structure describes the CPU caches.

dcache_config
Describes the data cache. This is required only if the CPU doesn't implement the CP15 cache-type register.

If the CPU does implement the CP15 cache-type register, set this to 0, so that the startup library will use arm_add_cache() to determine the cache register configuration based on the CP15 cache-type register.

dcache_rtn
A pointer to the callout used to manage the data cache.
icache_config
Describes the instruction cache. This is required only if the CPU doesn't implement the CP15 cache-type register.

If the CPU does implement the CP15 cache-type register, set this to 0, so that the startup library will use arm_add_cache() to determine the cache register configuration based on the CP15 cache-type register.

icache_rtn
A pointer to the callout used to manage the instruction cache.
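
To give a sense of what deriving the cache configuration from the CP15 cache-type register involves, the following sketch decodes one size field of that register. It is not the arm_add_cache() implementation. It assumes the ARMv6 layout of the register (Dsize in bits 23:12, Isize in bits 11:0, each holding size, associativity, M, and line-length subfields) and handles only the M = 0 case. Check the field layout against the technical reference manual for your CPU. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type; not a libstartup structure. */
struct cache_geom {
    unsigned size_bytes;  /* total cache size */
    unsigned ways;        /* associativity */
    unsigned line_bytes;  /* cache line length */
};

/* Extract the 12-bit data and instruction cache fields from the register value. */
uint32_t ctr_dsize(uint32_t ctr) { return (ctr >> 12) & 0xfffu; }
uint32_t ctr_isize(uint32_t ctr) { return ctr & 0xfffu; }

/*
 * Decode one 12-bit field (assumed layout: size in bits 9:6,
 * associativity in bits 5:3, M in bit 2, line length in bits 1:0).
 * Returns 0 on success, or -1 if M is set, which this sketch doesn't handle.
 */
int decode_cache_field(uint32_t field, struct cache_geom *out)
{
    unsigned size  = (field >> 6) & 0xfu;
    unsigned assoc = (field >> 3) & 0x7u;
    unsigned m     = (field >> 2) & 0x1u;
    unsigned len   = field & 0x3u;

    if (m != 0) {
        return -1;
    }
    out->size_bytes = 512u << size;   /* 0.5 KB scaled by the size field */
    out->ways       = 1u << assoc;
    out->line_bytes = 8u << len;      /* 2 words (8 bytes) scaled by len */
    return 0;
}
```

On a target, the register value would be read with a CP15 access. Here it is passed in as a plain integer so the decoding logic can be exercised on any host.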

armv_pte

Purpose: The armv_pte structure describes the MMU page table encodings.

upte_ro
User mode read-only pages.
upte_rw
User mode read-write pages.
kpte_ro
Kernel mode read-only pages.
kpte_rw
Kernel mode read-write pages.
mask_nc
Non-cacheable mappings.
l1_pgtable
L1 descriptor that points to an L2 page table.
kscn_ro
Kernel mode L1 read-only section mapping.
kscn_rw
Kernel mode L1 read-write section mapping.
kscn_cb
Cacheable section mapping.

setup()

Purpose: The setup() function performs any CPU-specific initialization.

For ARMv6, there is a generic function, armv_setup_v6(), that performs generic ARMv6 initialization:

The armv_setup_v6() function must be called by any CPU-specific setup function for an ARMv6 CPU after it has performed its CPU-specific actions.
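
The required ordering can be sketched as follows. The real armv_setup_v6() prototype isn't given here, so this example uses a hypothetical stub with no arguments, and my_cpu_setup() and board_cpu_specific_init() are made-up names. Only the call order is the point.

```c
#include <string.h>

/* Records the order in which the steps run, for illustration only. */
char call_log[64];

/* Hypothetical stand-in for libstartup's armv_setup_v6(); the real
 * function's arguments are not shown in this technote. */
void armv_setup_v6_stub(void)
{
    strcat(call_log, "generic;");
}

/* Made-up CPU-specific step, e.g. configuring an implementation-defined
 * control register on the target CPU. */
void board_cpu_specific_init(void)
{
    strcat(call_log, "specific;");
}

/* A CPU-specific setup function: perform the CPU-specific actions
 * first, then hand over to the generic ARMv6 initialization. */
void my_cpu_setup(void)
{
    board_cpu_specific_init();
    armv_setup_v6_stub();
}
```

In a real BSP, the setup field of the CPU's armv_chip entry would point to a function shaped like my_cpu_setup().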

Behavior of procnto-v6 shm_ctl()

The ARMv6 procnto-v6 removes the 32 MB process address space limit:

The procnto-v6 microkernel doesn't implement the ARM-specific global memory region implemented by the non-ARMv6 procnto. This means that shm_ctl() no longer has any ARM-specific special behavior. The shm_ctl() function exhibits the following:

If code must run on both ARMv6 and non-ARMv6 processors, you must check the __cpu_flags value at runtime to select the correct implementation. For example:

if (__cpu_flags & ARM_CPU_FLAG_V6) {
    /*
     * Code for ARMv6 processor only
     */
} else {
    /*
     * Code for non-ARMv6 processor only
     */
}
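
When the two implementations are whole routines, one way to apply this check is to make it once and keep a function pointer to the chosen routine. The sketch below shows that pattern. It is host-runnable, so it takes the flags as a parameter instead of reading __cpu_flags, and SKETCH_FLAG_V6 is a hypothetical stand-in for ARM_CPU_FLAG_V6. The function names are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for ARM_CPU_FLAG_V6; the real value comes from
 * the QNX headers. */
#define SKETCH_FLAG_V6 0x1u

typedef int (*sum_fn)(const int *v, size_t n);

/* Portable version, safe on any supported ARM processor. */
int sum_generic(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Placeholder for a version that would live in a separate object file
 * built with ARMv6 code generation enabled. Here it is plain C so the
 * sketch runs anywhere. */
int sum_v6(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Choose the implementation once, based on the CPU flags. */
sum_fn select_sum(unsigned cpu_flags)
{
    return (cpu_flags & SKETCH_FLAG_V6) ? sum_v6 : sum_generic;
}
```

On a target, the argument to select_sum() would be __cpu_flags, and the result would typically be stored once at startup. This keeps ARMv6-only code out of the execution path on older processors.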

Using ARMv6 instructions

By default, qcc generates only ARMv4 instructions. This ensures that all compiled code will run on any supported ARM processor.

The ARMv6 processor introduces a number of new instructions that may provide performance benefits for certain code. For example, DSP algorithms can take advantage of the new media instructions.

Using these instructions requires gcc and binutils versions that support ARMv6:

There are a number of ways you can optimize ARMv6 operations:


Caution: Object files, libraries, and binaries that are compiled to use ARMv6 instructions can run only on a target with an ARMv6 CPU. Running them on a non-ARMv6 CPU causes an undefined instruction exception (SIGILL signal) or may result in unpredictable behavior.