proj-jasper-jasperLowEndTargets

Wouldn't it be interesting if Jasper could be a good choice for embedded systems too? This is not one of my main goals, but Lua is inspiring in that it seems to have started out just trying to be a small, portable, simple language suitable for embedding into host applications, and it turned out to be suitable for running on low-end hardware too.

Jasper's goals are similar, though not identical, so maybe we can think about targeting low-end hardware too. The goals we share are simplicity and embeddability into host applications: it's hard to start using a new language if you have to write a project entirely in that language, but easier if you can start by adding new functionality to an existing project in the new language. Unlike Lua, however, we are not as concerned with performance.

Survey of potential low-end hardware targets

PIC, AVR, ARM, Intel Quark

ARM L1 cache

486 had 8K of L1 cache

Raspberry Pi

Uses an ARM1176JZF-S processor at 700 MHz (in the Broadcom BCM2835 SoC, which also has a GPU); the same core was also used by the iPod Touch and many smartphones. The ARM1176JZF-S can have its L1 cache configured from 4k to 64k.

This book claims that the Pi's L1 instruction cache is 16k and the L1 data cache is 16k.

Intel Quark

Intel Quark SoC X1000

16 Kbyte shared instruction and data L1 cache.

"The SoC also features a 512 Kbyte on-die embedded SRAM (eSRAM) that can be configured to overlay regions of DRAM to provide low latency access to critical portions of system memory. For robustness, the contents of this on-die eSRAM are also ECC protected."

Total memory size from 128 Mbyte to 2 Gbyte

Nest thermostat

ST Microelectronics STM32L151VB ultra-low-power 32 MHz ARM Cortex-M3 MCU -- http://www.ifixit.com/Teardown/Nest+Learning+Thermostat+2nd+Generation+Teardown/13818

Pebble watch

ARM Cortex-M3 processor

Arduino

Arduino Due: "The Due makes use of Atmel's SAM3U ARM-based process, which supports 32-bit instructions and runs at 96Mhz. The Due will have 256KB of Flash, 50KB of SRAM, five SPI buses, two I2C interfaces, five serial ports, 16 12-bit analog inputs and more. This is much more powerful than the current Uno or Mega.". Cortex M3

http://arduino.cc/en/Main/arduinoBoardDue gives different figures from the quote above: 96 KBytes of SRAM and 512 KBytes of Flash memory for code.

Phone baseband processors

" This operating system is stored in firmware, and runs on the baseband processor. As far as I know, this baseband RTOS is always entirely proprietary. For instance, the RTOS inside Qualcomm baseband processors (in this specific case, the MSM6280) is called AMSS, built upon their own proprietary REX kernel, and is made up of 69 concurrent tasks, handling everything from USB to GPS. It runs on an ARMv5 processor. "

Beaglebone

http://beagleboard.org/Products/BeagleBone%20Black

Processor: AM335x 1GHz ARM® Cortex-A8

    512MB DDR3 RAM
    2GB 8-bit eMMC on-board flash storage
    3D graphics accelerator
    NEON floating-point accelerator
    2x PRU 32-bit microcontrollers

Connectivity

    USB client for power & communications
    USB host
    Ethernet
    HDMI
    2x 46 pin headers

Software Compatibility

    Ångström Linux
    Android
    Ubuntu
    Cloud9 IDE on Node.js w/ BoneScript library
    plus much more

The PRU 32-bit MCUs are:

http://elinux.org/Ti_AM33XX_PRUSSv2

8KB program memory

8KB data memory

Cortex M-series

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0321a/BIHEADII.html

" The Cortex-M0, Cortex-M0+, Cortex-M1, Cortex-M3, and Cortex-M4 processors do not have any internal cache memory. However, it is possible for a SoC design to integrate a system level cache.

Note: A small caching component is present in the Cortex-M3 and Cortex-M4 processors to accelerate flash memory accesses during instruction fetches. "

Teensy 3.0

http://www.kickstarter.com/projects/paulstoffregen/teensy-30-32-bit-arm-cortex-m4-usable-in-arduino-a

Technical Specifications:

    32 bit ARM Cortex-M4 48 MHz CPU (M4 = DSP extensions)
    128K Flash Memory, 16K RAM, 2K EEPROM

ATmega1284P

Used in http://reprap.org/wiki/Melzi, and I often hear the ATmega128 referred to (e.g. search on this page for "ATmega128").

"The high-performance Atmel 8-bit AVR RISC-based microcontroller combines 128KB ISP flash memory with read-while-write capabilities, 4KB EEPROM, 16KB SRAM, 32 general purpose I/O lines, 32 general purpose working registers, a real time counter, three flexible timer/counters with compare modes and PWM, two USARTs, a byte oriented 2-wire serial interface, an 8-channel 10-bit A/D converter with optional differential input stage with programmable gain, programmable watchdog timer with internal oscillator, SPI serial port, a JTAG (IEEE 1149.1 compliant) test interface for on-chip debugging and programming, and six software selectable power saving modes. The device operates between 1.8-5.5 volts.

By executing powerful instructions in a single clock cycle, the device achieves throughputs approaching 1 MIPS per MHz, balancing power consumption and processing speed. " -- http://www.atmel.com/devices/atmega1284p.aspx

Apple I

4K or 8K bytes RAM, expandable to 65K.

Apple II

"...a MOS Technology 6502 microprocessor running at 1 MHz, two game paddles, 4 kB of RAM, an audio cassette interface for loading programs and storing data, and the Integer BASIC programming language built into the ROMs. The video controller displayed 24 lines by 40 columns of monochrome, upper-case-only (the original character set matches ASCII characters 20h to 5Fh) text on the screen, with NTSC composite video output suitable for display on a TV monitor, or on a regular TV set by way of a separate RF modulator. The original retail price of the computer was $1,298 USD (with 4 kB of RAM) and $2,638 USD (with the maximum 48 kB of RAM)."

Apple II Plus

" The Apple II Plus, introduced in June 1979, included the Applesoft BASIC programming language in ROM. This Microsoft-authored dialect of BASIC, which was previously available as an upgrade, supported floating-point arithmetic, and became the standard BASIC dialect on the Apple II series (though it ran at a noticeably slower speed than Steve Wozniak's Integer BASIC).

Except for improved graphics and disk-booting support in the ROM, and the removal of the 2k 6502 assembler/disassembler to make room for the floating point BASIC, the II+ was otherwise identical to the original II. RAM prices fell during 1980–81 and all II+ machines came from the factory with a full 48k of memory already installed. The language card in Slot 0 added another 16k, but it had to be bank switched since the remaining CPU address space was occupied by the ROMs and I/O area. For this reason, the extra RAM in the language card was bank-switched over the machine's built-in ROM, allowing code loaded into the additional memory to be used as if it actually were ROM. Users could thus load Integer BASIC into the language card from disk and switch between the Integer and Applesoft dialects of BASIC with DOS 3.3's INT and FP commands just as if they had the BASIC ROM expansion card. The language card was also required to use the UCSD Pascal and FORTRAN 77 compilers, which were released by Apple at about the same time. These ran under the UCSD p-System operating system, which had its own disk format and emitted code for a "virtual machine" rather than the actual 6502 processor. "
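The bank switching described in the quote above can be pictured with a small model. This is only a sketch under loud assumptions: the `BankedMemory` class, the `overlay_enabled` flag, and the `0xD000` ROM start are illustrative choices, and the model ignores the real card's soft switches and how its full 16k is fitted into the ROM region. It shows just the core idea: the same addresses read from ROM or from extra RAM depending on a switch.

```python
# Hypothetical, simplified model of RAM bank-switched over a ROM region.
# Not the real language card: no soft switches, one overlay bank only.

ADDR_SPACE = 0x10000   # a 6502 can address 64K
ROM_START = 0xD000     # assumed start of the ROM region for this sketch


class BankedMemory:
    def __init__(self):
        # Everything below ROM_START is treated as plain RAM here
        # (the real machine also has an I/O area in that range).
        self.main = bytearray(ROM_START)
        # Stand-in ROM contents; 0xEA is just a recognizable filler byte.
        self.rom = bytes([0xEA]) * (ADDR_SPACE - ROM_START)
        # Extra RAM that shares the ROM's addresses.
        self.overlay = bytearray(ADDR_SPACE - ROM_START)
        self.overlay_enabled = False

    def read(self, addr):
        if addr >= ROM_START:
            bank = self.overlay if self.overlay_enabled else self.rom
            return bank[addr - ROM_START]
        return self.main[addr]

    def write(self, addr, value):
        if addr >= ROM_START:
            # Writes only land when the RAM bank is switched in;
            # writes aimed at ROM are silently ignored.
            if self.overlay_enabled:
                self.overlay[addr - ROM_START] = value & 0xFF
            return
        self.main[addr] = value & 0xFF


mem = BankedMemory()
mem.overlay_enabled = True
mem.write(0xD000, 0x42)      # "load code into the additional memory"
mem.overlay_enabled = False  # switch back: the ROM is visible again
```

Code loaded into the overlay while it is switched in then behaves "as if it actually were ROM" from the CPU's point of view, which is what lets a second BASIC dialect live at the same addresses.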

Apple IIe

" The Apple II Plus was followed in 1983 by the Apple IIe, a cost-reduced yet more powerful machine that used newer chips to reduce the component count and add new features, such as the display of upper and lowercase letters and a standard 64 kB of RAM.

The IIe RAM was configured as if it were a 48 kB Apple II Plus with a language card; the machine had no slot 0, but instead had an auxiliary slot that for most practical purposes took the place of slot 3, the most commonly used slot for 80-column cards in the II Plus. "

Random Comments

http://lifehacker.com/how-to-pick-the-right-electronics-board-for-your-diy-pr-742869540

"It really makes zero sense to use an 8 bit uCtlr just about anywhere anymore, when you can get an ARM in the same size package and at nearly the same cost. Since flash dominates the die area in a microcontroller, 8-bit versus 32-bit logic is noise -- it has less cost impact than the package. There are a lot of Cortex-M3 parts in 48 pin packages now that cost only slightly more than 8 bit parts. (I should point out that there is huge difference between, an ARM Cortex-M3 and an ARM-A9, for instance an MMU.)

In the end, it comes down to MIPS and MFLOPS, and the die area and power required to do that much computation. When an ARM has enough functional units to match the MIPS and MFLOPS of an x86, it will take as much die area and power. At the complexity level of a Pentium IV, the added ugliness of the X86 instruction set is pretty much noise in the total die area and power. (In a past life I was an instruction decode and pipeline control logic design specialist -- I can tell you that x86 instruction decode is as ugly as it comes -- and in the day and age of out-of-order execution, that almost doesn't matter, except that because of all that ugliness x86 code is freakishly dense, which means the same size I-cache holds a lot more useful code. When you toss in the fact that the ugliness is also guarantees employment for instruction decode specialists, I'd call that a win :) "

" I use frequently the cheap $9.90 ARM 32 bits LPC2103 board, running at 48MHz, 32KB flash and 8KB RAM: http://www.wayengineer.com/index.php?main_page=product_info&products_id=129 "

" Wow that pdip cortex-m0 is awesome ! It can even run directly from 2 AAA batteries and is debuggable by openocd without ugly kludges. Once it sells on ebay or anywhere with cheap intl shipping, I’m definitely there ! The only shame is it doesn’t have a usb device interface (but you could make a software-based usb low speed device with tons of time to spare). Thanks a lot, hack a day ! "

"

Hussam Al-Hertani says: August 13, 2012 at 10:35 am

There are many ARM Cortex Boards out there. The best (non-arduino clones) in my opinion are the LPCXpresso boards/IDE. The boards cost $20-$30USD, come in many variations, LPC1114(M0)/LPC11U14(M0)/LPC1227(M0)/LPC1343(M3)/LPC1769(M3) with a hardware debugger and a LPCXPresso Developer environment limited to 128KB program size but thats a lot of programming…infact thats more memory than the program memory available on most of the micros on the LPCXpresso boards. The fact that a 28-pin DIP version of the LPC1114 (32KB Flash) micro is available is also nice though I’d like to see a DIP version with more flash memory…at least 64KB. ... The other two boards that have been announced but not produced are the Cortex-m0+ Freedom board from freescale and cortex-m4 Stellaris launchpad. The Cortex-m0+ Freedom board is $13. You get a Dev board (arduino compatible) with debugger. Hopefully they will be compatible with Freescale’s Evaluation version of CodeWarrior which is limited to 128KB. Again this should mean practically free development cost for the board since the on-board micro has 128KB of memory on it. The last one that (looking real good!) is the $5 Stellaris Launchpad (M4). The board will probably have an on-board debugger and some development tools support via TI’s code composer studio. "

"

peterbjornx says: August 13, 2012 at 2:47 pm

At the moment i’m designing an arduino pin-compatible board based around the TI LM3S817 (Cortex-M3 with 64kB rom and 2kB RAM), which i might also adapt to bigger Stellaris MCU’s "

" My personal favorite 32bit controller is the Maple. But all of my embedded projects have used 8 and 16-bit PICs. My temporarily embedded projects largely use Arduino-ites. "

" I'm writing this comment because I happen to be in the position of working on several projects across many microcontroller lines that have been mentioned in this post – ultra-low-end 8-bit PIC12/16Fs, 32-bit PIC32s, and 32-bit ARMs (mostly Cortex-M0).

Now I know my recent experience with PIC12/16s isn't particularly representative of most hobby 8-bit applications, as PIC16s are pretty crappy architectures in the 8-bit hobby world, which has AVRs (probably one of the highest performance but also most expensive 8-bit chips out there) taking a pretty big market share.

By far the best reason I can give you for switching to 32-bit is simply that they have so many features and perform so fast that they're just easier to work with. The 8-bit project I've been working on is a fairly extensive but also price constrained project with a custom low level infrared protocol decoder, several PWM outputs, and a software instruction decoder. As you can imagine, coding this to work was a b. However, at the sub $1 price point necessary per chip, there wasn't much of an alternative (I hadn't heard of Cortex-M0s / they hadn't come out when we originally spec'd the hardware for the project).

Now I absolutely wished we'd been able to use an ARM. At 48MHz, roughly 1-2 clocks per instructions, 13(!!!!!) PWM outputs, ADCs, 40+ GPIOs, an LPC11xx series does just about everything you wanted with very little work, and virtually no need to worry about not having enough clock cycles. I needed an infrared translator board (still needed processing–implemented a custom serial protocol and the same custom infrared protocol, with some automatic command timing specific to the application) and managed to finish the code literally in two days' worth of free time, compared to about 3-4 weeks for the 8-bit part of the application. No need to be concerned about how I implement passing variables (used good ol' reusable queue structs rather than static queues), using greater than 8-bit variables, etc–I could focus my time on getting my project to work rather than dealing with 8-bit nonsense (16-bit timers on 8-bit micros is a PAIN when buffering isn't available on your chip!)

You can find dev boards at the <$10 range. ... Anyway, to end this, I'm going to link to elm-chan (crazy dude who made his own analog feedback loop to drive a high performance laser projector) and his article on switching from 8-bit to 32-bit Cortex-M0s. It's in Japanese, so you'll have to do a translation, but it's well worth the read as it really matched up pretty well with my experiences and I think many hobbyists' experiences with switching to 32-bit. "

" On-board RAM on 8-bit microcontrollers is at most 8 to 16 K bytes, so when more RAM than this is required, a 32-bit microcontroller has to be selected. "

LeafLabs Maple

http://leaflabs.com/devices/maple/

Flash Memory: 128 KB SRAM: 20KB
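One rough way to use the figures collected above is to ask which parts could hold a runtime of a given size. The sketch below is only an illustration: the code-memory and RAM numbers are the ones quoted in these notes, while the `fits` helper and the 40 KB / 4 KB footprint in the usage line are hypothetical and are not measurements of Jasper or of any real VM.

```python
KB = 1024

# (code memory, RAM) in bytes, as quoted in the notes above.
TARGETS = {
    "Arduino Due (arduino.cc figures)": (512 * KB, 96 * KB),
    "Teensy 3.0": (128 * KB, 16 * KB),
    "ATmega1284P": (128 * KB, 16 * KB),
    "LeafLabs Maple": (128 * KB, 20 * KB),
    "LPC2103 board": (32 * KB, 8 * KB),
    "TI LM3S817": (64 * KB, 2 * KB),
    "BeagleBone PRU": (8 * KB, 8 * KB),
}


def fits(code_needed, ram_needed, targets=TARGETS):
    """Return the sorted names of targets with enough code memory and RAM."""
    return sorted(
        name
        for name, (code, ram) in targets.items()
        if code >= code_needed and ram >= ram_needed
    )


# Hypothetical footprint: 40 KB of code and 4 KB of working RAM.
candidates = fits(40 * KB, 4 * KB)
```

With that made-up footprint, the LPC2103 board drops out on code memory, the LM3S817 on RAM, and the PRU on both. This ignores everything the host application itself needs, so it is an optimistic filter at best.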

BeagleBoard

JVMs for low-end embedded systems

On PCs, the JVM is known for being relatively memory-hungry, so it's interesting to ask how low-end JVMs can go, and how they do it.

KESO

http://www.embedded.com/electronics-blogs/cole-bin/4389892/KESO--A-Java-VM-an-MCU-developer-could-love--Maybe-

" Unlike traditional JVMs, KESO is designed to operate in tandem with an RTOS, leaving to the RTOS many functions traditional JVMs might take on. For example, it does not implement thread scheduling and thread synchronization, but uses an existing RTOS to do these tasks. As with most Java implementations, it makes selective use of garbage collection, but when it does, it uses a bare-bone, primitive form that assumes operation in a priority-based scheduling environment, another function it lets devolve to the RTOS. Also, rather than depend on the traditional Java VM compiler to generate executable Java code, KESO is provided with a companion compiler called “jino” that generates native C code in an ahead-of-time fashion, making possible a very slim run time environment and fast execution times. ... Rather than just any C code, KESO generates ISO-C90 code, which has a number of advantages. First, because it makes use of a standard C compiler available for most embedded microcontrollers, there is no need for a compiler backend for each target microcontroller. ... In the 8-bit domain, a typical memory configuration is about 8 kilobytes of flash and about 500 to 600 bytes or so of internal SRAM.

The current compiler backend for KESO requires an OSEK/VDX or an AUTOSAR-compatible OS, the most common platforms used in automotive designs. In that environment, KESO takes advantage of the fact that OSEK/VDX is at its core little more than a scheduler based on static priorities. "

TinyVM

"A really low footprint (< 10 Kb) Java VM and replacement firmware for the Lego Mindstorms RCX microcontroller"

http://tinyvm.sourceforge.net/

" TinyVM's footprint is about 10 Kb in the RCX. Additionally, program class files are compacted considerably (i.e. resolved) before they are downloaded to the RCX. A small program can access around 16 Kb of RAM. The overhead for each object is 4 bytes, and there is no alignment of fields (e.g. a byte field always requires one byte).

Features of TinyVM that aren't always found in other RCX programming systems are listed below.

    Great object oriented language (Java)
    Preemptive threads
    Exceptions
    Synchronization
    Arrays, including multidimensional ones
    Recursion
    Access to RCX buttons
    No need to install a cross-compiler
    Fairly easy to install, even under CygWin
    TinyVM's firmware allows itself to be replaced
    Comes with an emulation tool
    Nicely documented APIs 

Since 0.2.0:

    You can rerun a program.
    Full object persistence across runs.
    tinyvm.rcx.Time with sleep() and currentTimeMillis().
    Timers (tinyvm.rcx.Timer).
    Random numbers (java.util.Random).
    LCD characters (tinyvm.rcx.TextLCD). 

Since 0.2.2:

    Auto power off 

Limitations

Evidently, it isn't feasible to put a complete Java runtime in 32 Kb, let alone 10 Kb. The most important limitations and missing features of TinyVM are:

    No garbage collection
    No floating point support
    No switch statements
    String constants are ignored 

Floating point arithmetic and string constants have already been implemented in leJOS, and it's likely that the other limitations will also be addressed there. There are a lot of interesting features that could be added to TinyVM.
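The object-layout figures quoted above (4 bytes of overhead per object, no alignment of fields, around 16 Kb of RAM for a small program) are enough for a back-of-the-envelope size estimate. This is a sketch, not TinyVM's actual allocator: the `object_size` and `max_objects` helpers are hypothetical, and the field widths other than `byte` are assumed to be the standard Java widths, which the quoted text does not confirm.

```python
OBJECT_OVERHEAD = 4  # bytes per object, per the TinyVM description above

# Assumed field widths in bytes; only "byte" = 1 is stated in the source.
FIELD_SIZES = {"byte": 1, "short": 2, "char": 2, "int": 4}


def object_size(fields):
    """Bytes for one object: fixed overhead plus unpadded field sizes."""
    return OBJECT_OVERHEAD + sum(FIELD_SIZES[f] for f in fields)


def max_objects(heap_bytes, fields):
    """How many such objects fit in heap_bytes, ignoring everything else."""
    return heap_bytes // object_size(fields)


# An object with one byte field and one int field: 4 + 1 + 4 = 9 bytes,
# with no padding inserted between or after the fields.
size = object_size(["byte", "int"])
count = max_objects(16 * 1024, ["byte", "int"])
```

Since the limitations list says there is no garbage collection, a count like this is an upper bound on the total number of such objects a program can ever allocate, not just on how many can be live at once.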