proj-oot-lowEndTargets-lowEndTargetsUnsorted

---

https://www.spark.io/

    STM32F103 microcontroller
    ARM Cortex M3 architecture
    32-bit 72MHz processor
    128KB of Flash, 20KB of RAM

I think that has either no instruction cache or a 4KB or 8KB one, but I'm not at all sure.

The take-home for us is probably the amounts of flash and RAM. Again, it would be nice to fit the main interpreter in 16K or less, with about 64K as the upper limit.

---

The E64G401 Epiphany-IV 64-core 28nm Microprocessor has 32KB local (but shared) memory per core (so 32KB x 64 = 2MB total).

http://en.wikipedia.org/wiki/Adapteva

http://www.adapteva.com/wp-content/uploads/2013/06/e64g401_datasheet_4.13.6.14.pdf

---

woah these are cheap:

https://en.wikipedia.org/wiki/Odroid

I think the Exynos 4412 has a 32KB/32KB L1 cache -- http://malideveloper.arm.com/develop-for-mali/development-platforms/hardkernel-odroid-u2-development-platform/

---

http://linuxgizmos.com/intel-unveils-tiny-x86-minnowboard-max-open-sbc/

Raspberry Pi: $25/$35
BeagleBone Black: $45
MinnowBoard SBC: $99

" tdicola 13 hours ago

This looks neat for people that want a cheap board to hack on embedded Linux. However for serious control of signal generation, acquisition, PWM, servos, etc. you really don't want to be running a multitasking OS. Something like the Beaglebone Black, with its dedicated 200MHz programmable units in addition to embedded Linux, is much more interesting for hackers and makers IMHO.

reply "

" stonemetal 6 hours ago

PRU -> programmable real time unit

BBB -> BeagleBone Black

The BBB has an extra dual core processor that runs at 200MHz. It is interesting because it is like the processor they teach you about in your intro to computer architecture classes: every instruction is a single cycle instruction. Since it is a co-processor (not running an OS but controllable from the BBB's OS) and execution of instructions is deterministic, it is a good choice for running hard real time code. "

" ah- 13 hours ago

I wouldn't call the minnowboard a microcontroller, it's more similar to other single board computers like the Pandaboard and the odroid boards. And 2GB are already common for such boards, so 4GB are really not far off.

reply "

"

outside1234 6 hours ago

Does anyone know how the performance on something like this stacks up to something like the Raspberry Pi?

wmf 5 hours ago

A 1.4 GHz Silvermont must be many times faster than a 700 MHz ARM11.

reply "

"

kqr2 14 hours ago

Intel also has the Galileo board which is hardware and software pin-compatible with shields designed for the Arduino Uno* R3.

http://www.intel.com/content/www/us/en/intelligent-systems/g...

makomk 11 hours ago

The Galileo's one of those boards where it's very important to pay attention to the fine print. For example, the GPIO controller is hanging off a relatively slow I2C port, so access to GPIO is much, much slower than even the lowest-end Arduino. Also, it's a modified 486 which takes multiple clock cycles to carry out many instructions that are single-cycle on modern ARM, so it's not as fast at arithmetic as the clock speed would suggest.

tdicola 14 hours ago

Be careful though, the Galileo emulates AVR code and is orders of magnitude slower than a real Arduino. Don't expect to pick up any shield and make it work, unfortunately.

jpwright 3 hours ago

The Galileo actually only emulates a subset of the Arduino libraries. The AVR libraries themselves are, for the most part, not supported. This makes many popular libraries unusable even when hardware is not an issue.

reply "

" elnate 14 hours ago

How does this (note: the MinnowBoard SBC) compare to a Raspberry Pi?

vonmoltke 9 hours ago

Comparing the $99 version to the B ($35):

Overall, probably worth the extra cost if you need the power and features. The question is, who does? I'm considering this for no other reason than I want a board in this form factor and power class that has SATA and PCIe.

nullc 6 hours ago

The RPI is really obscenely slow, far slower than the clock rate would suggest even for an ARM. The RPI is pretty exciting as a microcontroller, though its power usage is very high, but as a computer it's a real disappointment.

The real comparison should be with the odroid boards: http://hardkernel.com/main/products/prdt_info.php?g_code=G13... a quad arm (cortex-a9) at 1.7GHz with 2GB ram for ~$60.

reply "

---

(I already read this): http://www.digikey.com/en/articles/techzone/2012/jun/low-power-16-bit-mcus-expand-the-application-space-between-8--and-32-bit-options

---

a picture shown while discussing the L4 cache in Crystalwell's eDRAM:

http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3

so the memory hierarchy jumps after 32K, 256K, and 4M. Also, the text notes that both Intel and Microsoft found that 32M was a good amount of eDRAM to have.

---

https://en.wikipedia.org/wiki/Calxeda apparently had this manycore building-block product:

"In March 2011 Calxeda announced a 480-core server in development, consisting of 120 quad-core ARM Cortex-A9 CPUs.[3][4][5] .. EnergyCore? ECX-1000, featuring four 32-bit ARMv7 Cortex-A9 CPU cores operating at 1.1–1.4 GHz, 32 KB L1 I-cache and 32 KB L1 D-cache per core, 4 MB shared L2 cache, 1.5 W per processor, 5 W per server node including 4 GB of DDR3 DRAM, 0.5 W when idle.[8][9] Each chip included five 10 gigabit Ethernet ports. Four chips are carried on each EnergyCard?.[8] "

Tilera's TILE-Gx8072 with 72 processors has

" Seventy-two cores operating at frequencies up to 1.2 GHz • 64-bit architecture (datapath and address)

...

32 KB L1 instruction cache and 32 KB L1 data cache per core • 256 KB L2 cache per core

"

---

http://www.realworldtech.com/haswell-cpu/2/ says the Haswell and Sandy Bridge front end includes a 1.5K "L0" uop cache, in front of the 32k L1 icache.

I guess that's about the same as a 32k icache, if you assume that there's about 1 uop per 16 bytes? But it's probably more like 1 uop per 6 bytes.
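rough arithmetic on that (mine): taking "1.5K" to mean ~1536 uop entries, at 16 bytes of x86 code per uop the uop cache would cover 1536 x 16 ≈ 24KB, which is indeed roughly comparable to the 32k L1 icache; at 6 bytes per uop it only covers 1536 x 6 ≈ 9KB.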

--

some 'matchbox pcs':

http://matchboxpc.thydzik.com/

https://en.wikipedia.org/wiki/Geode_%28processor%29#Geode_GXLV -- x86 processor, 16k unified L1 cache

Tiqit (Pratt's): used one of these (they call it a '486sx'): http://www.cpu-world.com/CPUs/ElanSC400/ and they reference the expired page http://www.amd.com/products/lpd/techdocs/e86/21030.pdf , which is probably http://support.amd.com/TechDocs/21030.pdf . Section 3.4 of that reference manual says the system has an 8k unified L1 cache (and no L2 cache) and uses a 486 instruction set at 100MHz with no floating point unit.

http://www.pcworld.com/article/2044279/16-small-but-powerful-matchbox-pcs.html "16 small but powerful matchbox PCs" by Serdar Yegulalp, Computerworld, Jul 13, 2013 -- tiny PCs that even have their own keyboards, like the Qi Ben NanoNote

..." education-oriented Raspberry Pi ($35)

hobbyist-and-manufacturing-oriented Gumstix Overo series (from $99-$229).

hacker-friendly BeagleBone Black ($44.95). These are three of the most popular devices in this category.

Other devices have surfaced in the wake of the success of the Raspberry Pi and its peers, each a variant on the theme.

Clockwise, starting at top left: The Gooseberry (about $62) is a repurposed printed circuit board assembly originally developed for tablets rather than an original design like the Pi, but no less useful for that.

The Rascal Micro ($199) eschews video connectivity in favor of networking, so it can be used as a miniature headless system for controlling other devices.

And the PandaBoard (and PandaBoard ES, its successor), at around $175, is pricier than the Pi; it sports a few more connectors and slightly more expandability "

(by '..."' I mean 'contains mostly quotes but with much ellipsis and perhaps even paraphrasing')

$89 Korean-made Odroid U2 packs in an Exynos4412 Prime ARM Cortex-A9 quad-core processor, much faster than the Pi's ARM-powered Broadcom SoC.

Another board that's used widely in automation projects is the Arduino, now available in a whole cornucopia of editions.

The emphasis here isn't on power or speed, though: The Arduino Uno ($55 for the bare board, $60 for a retail box version) sports an 8-bit RISC processor running at a mere 16MHz (an Intel Core i7 runs around 3GHz).

Boxed up and ready to go

Many matchbox systems come as a bare board, for which you have to supply your own case. These units, on the other hand, come packaged in a case of some kind, courtesy of the manufacturer. They are often used as mini-media centers.

Clockwise, starting at top left: The Cotton Candy ($199) and Rikomagic (about $86) both run Android, while the CuBox ($119) has additional hobbyist-friendly features, such as a recovery mode that prevents it from being bricked by mistake.

Almost a full PC

These built-up matchbox systems offer a little more breathing room.

Clockwise from top left: The Trim-Slice H packs not only an ARM Cortex-A9 processor and an NVIDIA Tegra 2 chipset but a 2.5-inch SATA hard disk into a fanless case. Prices start at $279, with developer kits available at $175.

The folks at Cappuccino PC build full-blown Intel systems (Atom or Core, your choice); the fanless SlimPro SP675FP measures 10 in. on its longest side and sells for $685.

CompuLab's fit-PC3, which starts at $275 with minimal configuration, uses a dual-core 64-bit AMD processor with a 2.5-in. hard disk and a Radeon HD 6250 or 6320 GPU.

Keyboard included

Some even come with a keyboard.

Clockwise, from top left: The Ben NanoNote runs its own custom build of OpenWrt, the Jlime distribution or anything else you can get to run on its 336MHz MIPS processor. Only 1,500 pre-manufactured units were made, but the hardware design is available as an open project.

Next up in size, the OpenPandora (starting at $479) is billed as a mixture of PC and gaming console and is only a little larger than the Nintendo DS.

The Gecko Surfboard ($119) packs an Intel-powered system into a standard-sized keyboard but only uses 5 watts -- hearkening back to the everything-in-the-keyboard design of the Commodore 64/128.


education-oriented Raspberry Pi ($35): 32k unified L1 cache (e.g. like a 16k icache)

hobbyist-and-manufacturing-oriented Gumstix Overo series (from $99-$229): 16k unified cache? ( https://pixhawk.ethz.ch/omap/start )

hacker-friendly BeagleBone Black ($44.95): 32K/32K L1 cache

The Gooseberry (about $62): Allwinner A10 ARM Cortex-A8 (32+32k L1 cache, 512k L2 cache), Mali 400 graphics

The Rascal Micro ($199): AT91SAM9G20B-CU (?), 400 MHz ARM (ARM926EJ-S), 32+32k L1 cache

PandaBoard ($175): TI OMAP4430 dual-core ARM Cortex-A9 CPU, with two ARM Cortex-M3 cores, 32+32k L1 cache

$89 Odroid U2: Exynos4412 Prime ARM Cortex-A9 quad-core processor, 32KB/32KB L1 cache

Arduino Uno ($55): "The Arduino UNO has only 32K bytes of Flash memory and 2K bytes of SRAM" (and no cache?) https://learn.adafruit.com/memories-of-an-arduino/arduino-memory-architecture

Cotton Candy ($199): 1.2 GHz Exynos 4210 (ARM Cortex-A9, 32+32k L1 cache, 1MB L2 cache), Mali 400 graphics

Rikomagic (about $86): 32k+32k L1 cache ( http://complete-concrete-concise.com/blog/raspberry-pi-and-the-mk802-a-side-by-side-comparison )

CuBox ($119): Marvell Armada 510 (88AP510) SoC with ARM v6/v7 (32/32 L1 cache?)

Gecko Surfboard ($119): Vortex86 ( https://en.wikipedia.org/wiki/Vortex86 ), 16+16k L1 cache

Intel Galileo: 16 KB L1 cache ( http://www.mouser.com/applications/open-source-hardware-galileo-pi/ ). "Arduino says it's 400MHz 32-bit Intel® Pentium instruction set architecture (ISA)-compatible processor o 16 KBytes on-die L1 cache", which does not tell us much: the 80486 and Pentium have very little difference from an ISA POV, and later models of both had a 16 KByte cache, thus the 80486 looks plausible, too.

---

http://iqjar.com/jar/an-overview-and-comparison-of-todays-single-board-micro-computers/

---


in this blog post is a list of popular embedded systems:

    ARM Cortex-M
    AVR
    AVR32
    ColdFire
    HC12
    MSP430
    PIC18
    PIC24/dsPIC
    PIC32 (MIPS)
    PowerPC
    RL78
    RX100/600
    SH
    V850
    x86

" Third parties offered a wide range of upgrades, for both SX and DX systems. The most popular ones were based on the Cyrix 486DLC/SLC core, which typically offered a substantial speed improvement due to its more efficient instruction pipeline and internal L1 SRAM cache. The cache was usually 1 kB, or sometimes 8 kB in the TI variant. " -- http://en.wikipedia.org/wiki/Intel_80386

---

intel Quark (which is the CPU of Intel Galileo) is said to be an updated 486 (google intel+quark+486 to find remarks like this). It says it uses the Pentium instruction set, but apparently this is similar to the 486 instruction set:

http://en.wikipedia.org/wiki/P5_%28microarchitecture%29#Major_improvements_over_i486_microarchitecture

---

The intel 80286 had a 24-bit address bus and was able to address up to 16 MB of RAM, compared to 1 MB for its predecessor.


Misc

http://www.cs.arizona.edu/~arvind/papers/lctes03.pdf

Table 1: AX Instructions (AX Instruction / Description)

    setpred     support for predication in 16-bit code
    setsbit     sets the 'S' bit to avoid explicit cmp instructions
    setsource   sets the source register for the next instruction
    setdest     sets the destination register for the next instruction
    setthird    sets the third operand (support 3-address format)
    setimm      sets the immediate value for the next instruction
    setshift    sets the shift type and amount for the next instruction
    setallhigh  indicates the next instruction uses all high registers


another libc:

https://github.com/lpsantil/rt0 https://news.ycombinator.com/item?id=8974024

really tiny (claims to be the smallest in the world):

michigan micro mote:

"8-bit CPU, a 52x40-bit DMEM, a 64x10-bit IMEM, a 64x10-bit IROM" [1]

---

	A reimplementation of NetBSD using a Microkernel [video] (youtube.com)
	agumonkey 2 days ago

Youtube video description:

Based on the MINIX 3 microkernel, we have constructed a system that to the user looks a great deal like NetBSD. It uses pkgsrc, NetBSD headers and libraries, and passes over 80% of the KYUA tests. However, inside, the system is completely different. At the bottom is a small (about 13,000 lines of code) microkernel that handles interrupts, message passing, low-level scheduling, and hardware related details. Nearly all of the actual operating system, including memory management, the file system(s), paging, and all the device drivers run as user-mode processes protected by the MMU. As a consequence, failures or security issues in one component cannot spread to other ones. In some cases a failed component can be replaced automatically and on the fly, while the system is running, and without user processes noticing it. The talk will discuss the history, goals, technology, and status of the project.

The latest work has been adding live update, making it possible to upgrade to a new version of the operating system WITHOUT a reboot and without running processes even noticing. No other operating system can do this.

The system is built on MINIX 3, a derivative of the original MINIX system, which was intended for education. However, after the original author, Andrew Tanenbaum, received a 2 million euro grant from the Royal Netherlands Academy of Arts and Sciences and a 2.5 million euro grant from the European Research Council, the focus changed to building a highly reliable, secure, fault tolerant operating system, with an emphasis on embedded systems. The code is open source and can be downloaded from www.minix3.org. It runs on the x86 and ARM Cortex V8 (e.g., BeagleBones). Since 2007, the Website has been visited over 3 million times and the bootable image file has been downloaded over 600,000 times. The talk will discuss the history, goals, technology, and status of the project.

Animats 2 days ago

That's nice, but late. QNX had that 10-15 years ago. With hard real time scheduling, too.

All you really need in a practical microkernel is process management, memory management, timer management, and message passing. (It's possible to have even less in the kernel; L4 moved the copying of messages out of the kernel. Then you have to have shared memory between processes to pass messages, which means the kernel is safe but processes aren't.)

The amusing thing is that Linux, after several decades, now has support for all that. But it also has all the legacy stuff which doesn't use those features. That's why the Linux kernel is insanely huge. The big advantage of a microkernel is that, if you do it right, you don't change it much, if at all. It can even be in ROM. That's quite common with QNX embedded systems.

(If QNX, the company, weren't such a pain... They went from closed source to partially open source (not free, but you could look at some code) to closed source to open source (you could look at the kernel) to closed source. Most of the developers got fed up and quit using it. It's still used; Boston Dynamics' robots use it. If you need hard real time and the problem is too big for something like VxWorks, QNX is still the way to go.)

vezzy-fnord 2 days ago

QNX is fascinating on its own, but MINIX 3 is still a different project in that its full adoption of a NetBSD userland will probably make it more useful for generic servers and workstations as well. They also seem to be going much deeper with checkpointing and dynamic upgrades/hot code reloading.

If you need hard real time and the problem is too big for something like VxWorks, QNX is still the way to go.

There's all sorts of much tinier RTOS like FreeRTOS, MicroC/OS and Contiki that are used out there for particularly critical and/or constrained environments.

 nchelluri 1 day ago

When is VxWorks inappropriate, but QNX appropriate?

EDIT: http://www.embeddedrelated.com/showthread/comp.arch.embedded... says:

> the most fundamental difference between VxWorks and QNX is as you have described, QNX lends itself to a message passing architecture while VxWorks lends itself to a shared memory architecture.

>

> My personal opinion is that a message passing architecture is easier to get to grips with and as such is potentially easier to understand and debug.

> However, the majority of software engineers with experience of an embedded RTOS will be very well informed about the Shared Memory architecture.

gte525u 1 day ago

I think it's less of an issue now than say 10 years ago. VxWorks 6.x added support for protection domains (MPU/MMU support) and RTPs (real time processes). In VxWorks 5, everything operated in the kernel. Even with 6.x, very little typically runs by default in user space on a VxWorks setup.

With respect to the message passing - both support messaging. VxWorks has several types of message queues - the VxWorks-proprietary msgQLib API, the POSIX API, etc. QNX has much the same: MsgSend/MsgRecv, which is the microkernel API, and POSIX. QNX has an add-on PubSub middleware that the OP of the usenet group may be thinking of.

saosebastiao 1 day ago

Would this imply no support for mmap (or similar) in QNX? Or just not very optimal to use it?

gte525u 1 day ago

Both support mmap and shared memory - that's why I found the "shared memory" usenet post a little puzzling.

unethical_ban 2 days ago

I'm watching the video now, but are you suggesting that QNX, which is not Free and Open Source, has already accomplished MINIX's stated goals of OS reliability?

I would like to hear Mr. Tanenbaum's answer to the less provocative form of the sentiment: "What design decisions were made with MINIX3 that other RTOS with microkernels didn't consider?"

nickpsecurity 2 days ago

Tanenbaum cited QNX in Round 2 of the microkernel debate between him and Linus. It's had all sorts of great traits for a long time. It also had plenty of development time and a rip-off open source model to give it capabilities. Like Tanenbaum said in his paper, Minix 3 has had a small amount of core developers working on it for a relatively short amount of time. There's no way Minix 3 will trump QNX with such small resources and I doubt they planned to. It's more a start on building something using better engineering principles that might eventually become a great alternative to other UNIX's and Linux.

jacquesm 2 days ago

QnX achieved Minix's stated goals of OS reliability 20 years ago.

And Minix isn't a micro kernel in the same way that QnX is.

 jacquesm 1 day ago

You have a much smaller number of bugs because (a) each component is much simpler (b) runs as a separate process and so can be debugged and worked on by mere mortals and (c) works using a well defined interface (message passing) which makes testing and debugging a much simpler affair.

stox 2 days ago

I think UNIX-RTR has met those goals.

pjmlp 1 day ago

Thanks for pointing it out. I wasn't aware of it.

carussell 2 days ago

I took a serious look at MINIX over the winter, and digested several of Tanenbaum's talks around that time. (For anyone wondering if this talk contains anything substantially different from past ones, the answer is no.)

Here are some things to add:

For anyone looking in to maybe starting to work with MINIX, I'd suggest assessing whether or not you would be comfortable striking out and doing things on your own, and then being prepared to do so. With MINIX, you aren't going to find a thriving community that you can just add your piece to, so as to contribute to the effort. You might run into a certain level of that sort of old-guard, paralyzing stop energy, so in a way it's got a lot of the downsides of a greenfield project except with few, if any, of the upsides.

---

Wirth's RISC0 had 8K 32-bit words of program memory and 8K words of data memory (Harvard architecture, i.e. code separate from data).

Wirth's RISC5 had 1MB (von Neumann architecture, i.e. code and data in the same memory space).

-- http://www.inf.ethz.ch/personal/wirth/FPGA-relatedWork/RISC-Arch.pdf

---

IBM's TrueNorth chip family, not yet commercially available, is a (weakly) neuromorphic, deterministic integrate-leak-fire artificial neural network simulator. Its claim to fame is its low-power operation.

https://en.wikipedia.org/wiki/TrueNorth http://research.ibm.com/cognitive-computing/neurosynaptic-chips.shtml

4096 cores of 256 neurons each; 256 synapses per neuron; I think no memory except that contained in the states of the synapses and the states of the neuronal cell bodies; 70mW power consumption

Paper on their 'corelet' programming language. Afaict the only thing it really says is that neural networks can be used as library modules inside other neural networks. Duh. Really I think this is just one of those papers describing a boring but important implementation step, because you have to do that to get credit. Annoyingly afaict their implementation is not actually available online, so there's nothing of interest here for us until they put it up.

http://www.research.ibm.com/software/IBMResearch/multimedia/IJCNN2013.corelet-language.pdf

(the most interesting part is when they describe what's in their (non-available) library repository:

" The corelets currently in the Corelet Library include scalar functions, algebraic, logical, and temporal functions, splitters, aggregators, multiplexers, linear filters, kernel convolution (1D, 2D and 3D data), finite-state machines, non-linear filters, recursive spatio-temporal filters, motion detection, optical flow, saliency detectors and attention circuits, color segmentation, a Discrete Fourier Transform, linear and non-linear classifiers, a Restricted Boltzmann Machine, a Liquid State Machine, and more. The corelet abstraction and unified interfaces enable developers to easily replace a library corelet with an alternative implementation without disrupting the rest of the system. " )

paper on their simulator:

http://www.modha.org/blog/SC12/SC2012_Compass.pdf

summary/related work: http://www.artificialbrains.com/darpa-synapse-program#truenorth-compass

article: http://www.extremetech.com/extreme/187612-ibm-cracks-open-a-new-era-of-computing-with-brain-like-chip-4096-cores-1-million-neurons-5-4-billion-transistors

---

ok this is high end but:

http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/high-performance-xeon-phi-coprocessor-brief.pdf http://ark.intel.com/products/family/71840/Intel-Xeon-Phi-Coprocessors#@Server

intel xeon phi parallel coprocessor, ~8-16GB, ~1GHz freq, ~60 cores, ~244 threads (total, not per core; 4 threads per core), ~300 watts TDP

---

in Haskell each thread requires 1KB, so 1 million threads per GB of memory -- https://github.com/Gabriel439/post-rfc/blob/master/sotu.md

"Threads in Haskell are very cheap, and many people won't care about one additional thread. However, each thread comes with a stack, which takes memory. The stack starts off small (1Kb) and grows/shrinks in 32Kb chunks, but if it ever exceeds 1Kb, it never goes below 32Kb. For certain tasks (e.g. Shake build rules) often some operation will take a little over 1Kb in stack. Since each active rule (started but not finished) needs to maintain a stack, and for huge build systems there can be 30K active rules, you can get over 1Gb of stack memory. While stacks and threads are cheap, they aren't free." -- http://neilmitchell.blogspot.com/2014_06_01_archive.html

"Go 1.3 will have a minimum stack size back down at 4 kB. We hope that Go 1.4 will be able to ratchet the minimum stack size down to 1 or 2 kB." – Russ Cox Mar 11 '14 at 21:49 http://stackoverflow.com/questions/22326765/go-memory-consumption-with-many-goroutines#comment33947609_22333024

see also https://github.com/golang/go/issues/7514 , looks like it ended up at 2kb in Go 1.4 (see also https://github.com/golang/go/commit/6c934238c93f8f60775409f1ab410ce9c9ea2357 ): " A consequence is that stacks are no longer segmented, eliminating the "hot split" problem. When a stack limit is reached, a new, larger stack is allocated, all active frames for the goroutine are copied there, and any pointers into the stack are updated. Performance can be noticeably better in some cases and is always more predictable. Details are available in the design document.

The use of contiguous stacks means that stacks can start smaller without triggering performance issues, so the default starting size for a goroutine's stack in 1.4 has been reduced from 8192 bytes to 2048 bytes. " -- https://golang.org/doc/go1.4#runtime
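to make the per-thread-stack arithmetic concrete, here's a minimal C sketch (mine, not from the quoted posts) of the same tradeoff at the OS-thread level; it assumes a POSIX system with glibc, compiled with -pthread:

    /* Per-thread stacks bound thread counts: at the common 8MB default
       (ulimit -s), 1GB of stack memory allows only ~128 threads; shrinking
       to 16KB (PTHREAD_STACK_MIN on glibc/x86-64) allows ~65,000. Green
       threads (Haskell ~1KB, Go 1.4 ~2KB) push this toward ~10^6 per GB. */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *worker(void *arg) {
        (void)arg;
        sleep(1);   /* just hold the stack for a moment */
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_t t;
        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 16 * 1024);  /* 16KB instead of 8MB */
        if (pthread_create(&t, &attr, worker, NULL) != 0) {
            perror("pthread_create");
            return 1;
        }
        pthread_join(t, NULL);
        printf("1GB / 16KB stacks = ~65536 threads; 1GB / 2KB = ~524288\n");
        return 0;
    }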

---

https://en.wikipedia.org/wiki/Scratchpad_memory

" Sony's PS2 Emotion Engine employed a 16 KB scratchpad, to and from which DMA transfers could be issued to its GS, and main memory.

... NVIDIA's 8800 GPU running under CUDA provides 16 KB of scratchpad (NVIDIA calls it Shared Memory) per thread-bundle when being used for GPGPU tasks. Scratchpad also was used in the later Fermi GPU (GeForce 400 Series).[5] ... Cache control vs scratchpads

Many architectures such as PowerPC attempt to avoid the need for cacheline locking or scratchpads through the use of cache control instructions. Marking an area of memory with "Data Cache Block: Zero" (allocating a line but setting its contents to zero instead of loading from main memory) and discarding it after use ('Data Cache Block: Invalidate', signaling that main memory didn't receive any updated data) the cache is made to behave as a scratchpad. ... Shared L2 vs Cell local stores

Regarding interprocessor communication in a multicore setup, there are similarities between the Cell's inter-localstore DMA and a Shared L2 cache setup as in the Intel Core 2 Duo or the Xbox 360's custom powerPC: the L2 cache allows processors to share results without those results having to be committed to main memory. This can be an advantage where the working set for an algorithm encompasses the entirety of the L2 cache. However, when a program is written to take advantage of inter-localstore DMA, the Cell has the benefit of each-other-Local-Store serving the purpose of BOTH the private workspace for a single processor AND the point of sharing between processors; i.e., the other Local Stores are on a similar footing viewed from one processor as the shared L2 cache in a conventional chip. The tradeoff is that of memory wasted in buffering and programming complexity for synchronization, though this would be similar to precached pages in a conventional chip. ... Extending the working set, e.g., a sweet spot for a merge sort where the data fits within 8x256 KB "

https://en.wikipedia.org/wiki/Cell_%28microprocessor%29

" A DMA operation can transfer either a single block area of size up to 16KB, or a list of 2 to 2048 such blocks. ... The PPE contains a 64 KiB? level 1 cache (32 KiB? instruction and a 32 KiB? data) and a 512 KiB? Level 2 cache. ...

With the current generation of the Cell, each SPE contains a 256 KiB embedded SRAM for instruction and data, called "Local Storage" (not to be mistaken for "Local Memory" in Sony's documents that refer to the VRAM) which is visible to the PPE and can be addressed directly by software. Each SPE can support up to 4 GiB of local store memory. ... The SPEs contain a 128-bit, 128-entry register file and measures 14.5 mm2 on a 90 nm process. An SPE can operate on sixteen 8-bit integers, eight 16-bit integers, four 32-bit integers, or four single-precision floating-point numbers in a single clock cycle, as well as a memory operation. ... Compared to its personal computer contemporaries, the relatively high overall floating point performance of a Cell processor seemingly dwarfs the abilities of the SIMD unit in CPUs like the Pentium 4 and the Athlon 64. However, comparing only floating point abilities of a system is a one-dimensional and application-specific metric. Unlike a Cell processor, such desktop CPUs are more suited to the general purpose software usually run on personal computers. In addition to executing multiple instructions per clock, processors from Intel and AMD feature branch predictors. The Cell is designed to compensate for this with compiler assistance, in which prepare-to-branch instructions are created. For double-precision floating point operations, as sometimes used in personal computers and often used in scientific computing, Cell performance drops by an order of magnitude, but still reaches 20.8 GFLOPS (1.8 GFLOPS per SPE, 6.4 GFLOPS per PPE). The PowerXCell 8i variant, which was specifically designed for double-precision, reaches 102.4 GFLOPS in double-precision calculations.[36]

Tests by IBM show that the SPEs can reach 98% of their theoretical peak performance running optimized parallel matrix multiplication.[29] ... The EIB is a communication bus internal to the Cell processor which connects the various on-chip system elements: the PPE processor, the memory controller (MIC), the eight SPE coprocessors, and two off-chip I/O interfaces, for a total of 12 participants in the PS3 (the number of SPU can vary in industrial applications). The EIB also includes an arbitration unit which functions as a set of traffic lights. In some documents IBM refers to EIB participants as 'units'.

The EIB is presently implemented as a circular ring consisting of four 16 bytes wide unidirectional channels which counter-rotate in pairs. When traffic patterns permit, each channel can convey up to three transactions concurrently. As the EIB runs at half the system clock rate the effective channel rate is 16 bytes every two system clocks. At maximum concurrency, with three active transactions on each of the four rings, the peak instantaneous EIB bandwidth is 96 bytes per clock (12 concurrent transactions * 16 bytes wide / 2 system clocks per transfer). While this figure is often quoted in IBM literature it is unrealistic to simply scale this number by processor clock speed. The arbitration unit imposes additional constraints which are discussed in the Bandwidth Assessment section below. "

---

this is used for a flickering candle:

http://www.microchip.com/wwwproducts/Devices.aspx?product=PIC10F200

0.375K ROM, 16 bytes of RAM

and the same blog post also suggested one of these:

http://www.atmel.com/products/microcontrollers/avr/tinyavr.aspx

he didn't say which one, but the smallest one is 0.5K (and 8-bit of course)

---

the Apollo Guidance Computer had 12K of 16-bit words of ROM, and 1k of 16-bit words of RAM

---

"If you’re not familiar with Funcards, they’re basically standard AT90S8515 AVR microcontrollers in smartcard format." -- https://www.makomk.com/2010/02/04/arduino-based-funcard-programmer/

http://www.atmel.com/images/doc0841.pdf 8k ROM, 512 bytes of RAM

" AtMega? Card (Funcard) SmartCard? Programming & Fuse Setup

I recently got an Atmel AtMega?163-based smartcard" -- http://colinoflynn.com/2012/09/atmega-card-funcard-smartcard-programming-fuse-setup-2/

http://www.atmel.com/Images/doc1142.pdf 16k flash, 1k RAM

---

novena laptops ($2100 including 240G SSD) use the Freescale i.MX6, a SoC described at the links below:

https://en.wikipedia.org/wiki/I.MX#i.MX6x_series

https://en.wikipedia.org/wiki/Novena_%28computing_platform%29

http://www.freescale.com/products/arm-processors/i.mx-applications-processors-based-on-arm-cores/i.mx-6-processors/i.mx6qp/i.mx-6quad-processors-high-performance-3d-graphics-hd-video-arm-cortex-a9-core:i.MX6Q?tab=Documentation_Tab&pspll=1&SelectedAsset=Documentation&ProdMetaId=PID/DC/i.MX6Q&fromPSP=true&assetLockedForNavigation=true&componentId=2&leftNavCode=1&pageSize=25&Documentation=Documentation/00610Ksd1nd%60%60Data%20Sheets&fpsp=1&linkline=Data%20Sheets

http://spectrum.ieee.org/consumer-electronics/portable-devices/novena-a-laptop-with-no-secrets

http://cache.freescale.com/files/32bit/doc/data_sheet/IMX6DQIEC.pdf

https://www.crowdsupply.com/sutajio-kosagi/novena

the Novena also has a GPU and an FPGA and a CAAM:

http://www.cnx-software.com/2013/01/19/gpus-comparison-arm-mali-vs-vivante-gcxxx-vs-powervr-sgx-vs-nvidia-geforce-ulp/

(the link above discusses the computing power of Vivante's GC2000 GPU in the i.MX6Q)

---

mafuyu 4 hours ago

ARM isn't open sourcing anything anytime soon. Take a look at OpenRISC and RISC-V. They're aiming at a fully open source SoC implementation in silicon.

http://openrisc.io/

http://riscv.org/

agumonkey 4 hours ago

But IIRC (some blog benchmark) they are very very slow.

zhemao 2 hours ago

Also, the benchmarks we did a while ago on a taped-out chip with our in-order RV64 core, Rocket, showed that it compared quite favorably to an ARM Cortex A5.

http://riscv.org/download.html#tab_rocket_core

Unfortunately, it would be quite difficult for outside organizations to replicate these measurements unless they can pay TSMC for a fab run.

zhemao 2 hours ago

(Disclaimer: I am a PhD student in the Berkeley computer architecture group, which designs RISC-V)

An ISA can't be "fast" or "slow". It's just a specification. There's no reason you can't build a RISC-V core that's just as fast as an ARM or x86 core. The only reason we haven't done so is because we don't have access to the modern fabrication technologies that Intel and commercial ARM licensees use.

theresistor 1 hour ago

Instruction sets very much can be fast or slow, at least in the context of discussing specific use cases. Many of the inner loops that take up most of the active cycles on a CPU today (crypto, compression, imaging, signal processing, linear algebra) have very specific code patterns that can be targeted with specialized instructions that provide integer-multiple reductions in instruction count.

Let's assume that application-targeted instructions can reduce the size of the inner loops in these applications by 2x. Even the RISCiest cores do not, in practice, run at 2x the clock speed or 2x the issue rate of cores with application-targeted instructions. Thus, ISAs with baked in support for these use-case accelerating instructions will be more performant.

The RISCy core probably wins on mm^2, but a perf/mm^2 analysis will be highly dependent on how well designed the application-specific instructions are for area conservation.

---

nickpsecurity 4 hours ago

I know. I'm just assuming FPGA is trusted in TCB, esp with DMA, plus implying many people will want to use it. Your assessment of open HW is accurate. Far as FPGA's, there's progress on several fronts. Some for you to check out that your people might even consider using given the continual payoff of a FPGA w/out high unit costs.

Open-source FPGA architecture at 45nm http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-43...

Open-source bitstream generation for Xilinx w/out EULA violation or R.E. (!?) http://www.isi.edu/~nsteiner/publications/soni-2013-bitstrea...

Open tools for Lattice http://www.clifford.at/icestorm/

Open-source HW synthesis flow http://opencircuitdesign.com/qflow/

Note: Cliff is on a winning streak, eh?

I recently did a write-up showing what it could or would take at both ASIC and production levels here:

https://news.ycombinator.com/item?id=10468534

https://news.ycombinator.com/item?id=10468624

So, paths are clear and plenty of potential. Just no uptake. Will take government, corporate, or private sponsors working with academic (low NRE) and professional (experienced) HW designers that open-source stuff as they go. Proprietary, dual-licensed OSS is the only way to go for HW that I know of.

---

kawsper 4 hours ago

People are afraid of the Intel Management Engine, and you can read about it here: http://libreboot.org/faq/#intel

It is a dense and very detailed text, but basically your Intel CPU contains Active Management Technology (AMT) which lets remote users control your computer, which may or may not be what you want, and there might be backdoors hiding here.

It also includes Intel Boot Guard, which prevents users from installing their own firmware (such as libreboot and coreboot) because it needs to be signed with a key from Intel.

The page sums it up like this:

> In summary, the Intel Management Engine and its applications are a backdoor with total access to and control over the rest of the PC. The ME is a threat to freedom, security, and privacy, and the libreboot project strongly recommends avoiding it entirely. Since recent versions of it can't be removed, this means avoiding all recent generations of Intel hardware.

jakeogh 3 hours ago

Also see Joanna Rutkowska's recent paper: https://news.ycombinator.com/item?id=10458318 (fun reading).

---

groby_b 4 hours ago

Small, since that's how they make their money (design licenses)

But there's a completely open ARMv2 clone: http://opencores.org/project,amber

JoachimS 51 minutes ago

Which is probably the third open source attempt. ARM usually stomps hard on these projects. Anybody remember blackARM, nn ARM?

Clones are even worse off. TurboSilicon was attacked so hard they were thankful to hand all their assets to ARM.

---

nextos 5 hours ago

We need a cheap general purpose Novena-like laptop.

I'd argue a really good place to start from is a custom Rockchip machine (like Asus C201, which is now supported by Libreboot). Add a free GPU (or finish the Lima driver) and we are ready to go.

---

aij 4 hours ago

Why didn't they use an open source CPU? E.g. OpenSPARC

xobs 1 minute ago

Aside from the fact that we're familiar with Freescale and ARM, many other chips we "could have used" are unobtanium. While it's true that there is a source-level implementation, I can't find any T1 or T2 parts available for purchase. It's the same reason why we went with an A9 instead of an A15 or a 64-bit chip: You just can't buy them unless you're a big company. And if a small two-person company can't buy them, how can we claim it's open source hardware if you can't buy them either?

The other possibility would be to fab a chip ourselves, but that's a whole other order of magnitude in terms of cost and complexity, and the result isn't that great in terms of speed and available peripherals. Plus, when you fab a chip like this a lot of the hardware blocks are IP provided by the chip foundry, e.g. flash controllers and DRAM cells, and those are always closed-source. It just moves the whole thing one turtle down.

---

ASUS Chromebit CS10:

    SoC: Rockchip RK3288-C (4 x Cortex A17 + Mali T764)
    RAM: 2 GB LPDDR3
    NAND: 16GB
    Dimensions / Mass: 123 x 31 x 17mm, 75g
    OS: Chrome OS
    Other Connectivity: 2x2 802.11a/b/g/n/ac + BT 4.0, HDMI 1.4, USB 2.0, DC-in
    Price: $85

-- http://www.anandtech.com/show/9797/asus-launches-the-chromebit-cs10-hdmi-stick

---

" Continuing on from the focus on improving visual performance and responsiveness on Android 4.1 "Jelly Bean", the main objective of Android 4.4 was to optimize the platform for better performance on low-end devices, without compromising its overall capabilities and functionality. The initiative was codenamed "Project Svelte", which Android head of engineering Dave Burke joked was a weight loss plan after Jelly Bean's "Project Butter" added "weight" to the OS.[5] To simulate lower-spec devices, Android developers used Nexus 4 devices underclocked to run at a reduced CPU speed with only two cores active, 512 MB memory, and at qHD resolution—specifications meant to represent a "sweet spot" for entry-level devices.[5] "

---

Raspberry Pi Zero $5

" A Broadcom BCM2835 application processor 1GHz ARM11 core (40% faster than Raspberry Pi 1) 512MB of LPDDR2 SDRAM A micro-SD card slot A mini-HDMI socket for 1080p60 video output Micro-USB sockets for data and power An unpopulated 40-pin GPIO header Identical pinout to Model A+/B+/2B An unpopulated composite video header Our smallest ever form factor, at 65mm x 30mm x 5mm "

and zero W ($10, with wifi):

" 1GHz, single-core CPU 512MB RAM Mini-HDMI port Micro-USB On-The-Go port Micro-USB power HAT-compatible 40-pin header Composite video and reset headers CSI camera connector 802.11n wireless LAN Bluetooth 4.0 "

---

" The processor in the (Apple Macbook) charger is a MSP430F2003 ultra low power microcontroller with 1kB of flash and just 128 bytes of RAM....The 68000 microprocessor from the original Apple Macintosh and the 430 microcontroller in the charger aren't directly comparable as they have very different designs and instruction sets. But for a rough comparison, the 68000 is a 16/32 bit processor running at 7.8MHz, while the MSP430 is a 16 bit processor running at 16MHz. The Dhrystone benchmark measures 1.4 MIPS (million instructions per second) for the 68000 and much higher performance of 4.6 MIPS for the MSP430. The MSP430 is designed for low power consumption, using about 1% of the power of the 68000. "

---

$9 Computer Architecture: The Chips That Make C.H.I.P., C.H.I.P.

Speaker: Dave Rauchwerk Next Thing Co. [2]

About the talk:

Features: 1GHz ARM Cortex A8, 512MB RAM, 4GB NAND Flash, WiFi, Bluetooth.

Completely open source.

This talk will provide a technical overview of the hardware and software system architecture of the world's first $9 computer.

Uses an Allwinner R8 SoC, which has an ARM Cortex-A8 core with 32k icache and 32k dcache (L1) and 256k L2 cache. https://github.com/NextThingCo/CHIP-Hardware/blob/master/CHIP%5Bv1_0%5D/CHIPv1_0-BOM-Datasheets/Allwinner%20R8%20Datasheet%20V1.2.pdf

---

also pocketCHIP which has a CHIP and a keyboard and screen and runs PICO-8 for $50:

https://getchip.com/pages/pocketchip

---

also the CHIP Pro (is this different from the above talk? the above talk speaks of an ARM Cortex-A8, but this lists ARMv7-A; those are consistent, since the Cortex-A8 implements the ARMv7-A architecture):

$16 " 1GHz ARMv7-A 256MB/512MB DDR3/SLC NAND I2S Audio Dual Mics WiFi? B/G/N & BT4.2 Fully Certified Open Source HW, OS, No NDAs! " "powered by" $6 R8 SoC? + 256MB DDR3

[2]

Mali400 GPU [3]

interestingly, this is recommended over the Raspberry Pi Zero W (and maybe other models? not sure I understand) by multiple commentators b/c you can't get the Zero W in quantity [4]

---

ESP8266 "ESP-12E" ("ESP12"?)

https://www.adafruit.com/images/product-files/2471/0A-ESP8266__Datasheet__EN_v4.3.pdf a Wi-Fi TCP/IP stack, but also a 32-bit MCU ("Tensilica L106") with 16/24-bit instructions, ~36k of user-available RAM, and at least 0.5MB of flash

---

" With the dramatic success of the IBM PC in the early 1980's, it was obvious that there would someday be lisp implementations for personal computers. But the limitations of the early PC's 16 bit processor and its hobbled memory addressing scheme meant that a Lisp running on the PC would be little more than a toy. Lisp did not take off on the PC until the x386 computers with a 32 bit flat addressing space became plentiful in the late 1980’s . But the Macintosh, despite its well-known limitations, used the same Motorola 68000 CPU used in many engineering workstations. The original Macintosh had only 128k bytes of memory, but it this was more than most PC's at the time, and a number of third party memory expansion kits were available in almost immediately. Apple announced their own memory-enhanced Macintosh within a few months, and it was available in the fall of 1984. Seen this way, the Macintosh was not a more powerful PC, but rather a small inexpensive workstation. There was reason to think that it could be used as a Lisp platform. " -- http://basalgangster.macgui.com/RetroMacComputing/The_Long_View/Entries/2013/2/17_Macintosh_Common_Lisp.html

---

ESP8266-based devkit for the NodeMCU Lua system:

http://www.ebay.com/itm/ESP8266-ESP-12-NodeMCU-Lua-WiFi-Internet-Of-Things-Free-Shipping-Arr-1-10-BizDay-/271730851063?pt=LH_DefaultDomain_0&hash=item3f446bbcf7 $15

NodeMCU: http://nodemcu.com/index_en.html#fr_54745c8bd775ef4b99000011

Arduino-like hardware IO
Nodejs style network API
Less than $2 WI-FI MCU ESP8266 integrated and easy to prototype development kit

code examples at http://wayback.archive.org/web/20151231082750/http://www.nodemcu.com/index_en.html#fr_5475f7667976d8501100000f

copied to here:

Connect to the wireless network

    print(wifi.sta.getip())  -- nil
    wifi.setmode(wifi.STATION)
    wifi.sta.config("SSID","password")
    print(wifi.sta.getip())  -- 192.168.18.110

Arduino-like IO access

    pin = 1
    gpio.mode(pin,gpio.OUTPUT)
    gpio.write(pin,gpio.HIGH)
    gpio.mode(pin,gpio.INPUT)
    print(gpio.read(pin))

HTTP Client

    -- A simple http client
    conn=net.createConnection(net.TCP, false)
    conn:on("receive", function(conn, pl) print(pl) end)
    conn:connect(80,"121.41.33.127")
    conn:send("GET / HTTP/1.1\r\nHost: www.nodemcu.com\r\n"
      .."Connection: keep-alive\r\nAccept: */*\r\n\r\n")

HTTP Server

    -- a simple http server
    srv=net.createServer(net.TCP)
    srv:listen(80,function(conn)
      conn:on("receive",function(conn,payload)
        print(payload)
        conn:send("<h1> Hello, NodeMCU. </h1>")
      end)
    end)

PWM

    function led(r,g,b)
      pwm.setduty(1,r)
      pwm.setduty(2,g)
      pwm.setduty(3,b)
    end
    pwm.setup(1,500,512)
    pwm.setup(2,500,512)
    pwm.setup(3,500,512)
    pwm.start(1)
    pwm.start(2)
    pwm.start(3)
    led(512,0,0) -- red
    led(0,0,512) -- blue

Blinking Led

    lighton=0
    tmr.alarm(0,1000,1,function()
      if lighton==0 then
        lighton=1
        led(512,512,512) -- 512/1024, 50% duty cycle
      else
        lighton=0
        led(0,0,0)
      end
    end)

Bootstrap

    -- init.lua will be executed on boot
    file.open("init.lua","w")
    file.writeline([[print("Hello World!")]])
    file.close()
    node.restart() -- this will restart the module.

Use timer to repeat

    tmr.alarm(1,5000,1,function() print("alarm 1") end)
    tmr.alarm(0,1000,1,function() print("alarm 0") end)
    tmr.alarm(2,2000,1,function() print("alarm 2") end)
    -- after some time
    tmr.stop(0)

A pure lua telnet server

    -- a simple telnet server
    s=net.createServer(net.TCP,180)
    s:listen(2323,function(c)
      function s_output(str)
        if(c~=nil) then
          c:send(str)
        end
      end
      node.output(s_output, 0) -- redirect output to function s_output
      c:on("receive",function(c,l)
        node.input(l) -- like pcall(loadstring(l)), supports multiple separate lines
      end)
      c:on("disconnection",function(c)
        node.output(nil) -- unregister the redirect function; output goes to serial
      end)
      print("Welcome to NodeMCU world.")
    end)

Interfacing with sensor

    -- read temperature with DS18B20
    t=require("ds18b20")
    t.setup(9)
    addrs=t.addrs()
    -- total number of DS18B20s; assume it is 2
    print(table.getn(addrs))
    -- the first DS18B20
    print(t.read(addrs[1],t.C))
    print(t.read(addrs[1],t.F))
    print(t.read(addrs[1],t.K))
    -- the second DS18B20
    print(t.read(addrs[2],t.C))
    print(t.read(addrs[2],t.F))
    print(t.read(addrs[2],t.K))
    -- just read
    print(t.read())
    -- just read as centigrade
    print(t.read(nil,t.C))
    -- don't forget to release it after use
    t = nil
    ds18b20 = nil
    package.loaded["ds18b20"]=nil

---

"There are three main companies out there making microcontrollers that are neither ancient 8051 clones or ARM devices: TI’s MSP430 series, Microchip and Atmel. " -- http://hackaday.com/2016/01/20/microchip-to-acquire-atmel-for-3-56-billion/

"Together, Microchip and Atmel will be the #3 MCU company in the world (trailing Renesas (OTCPK:RNECY) and NXP Semicondcutors (NASDAQ:NXPI) after its deal for Freescale), and Microchip will have a very fertile opportunity to drive margin synergies." -- http://seekingalpha.com/article/3817736-microchip-technology-atmel-right-match

---

"For years Microchip top management was like mule on bridge not wanting to step ahead :) They were refusing to buy ARM licensee and bet on MIPS and they were missing a lot of sale opportunities with this odd decision. Whatever they do with PIC32 it’s not so successful like the STM32s and LPCs and they miss sales for millions $$$. This is not because MIPS architecture is bad, quite opposite it’s well developed in networking devices, but MIPS Soc from Mediatek running Linux at 400Mhz cost $2 while Microchip sells MIPS PIC32 with no MMU running at 80Mhz for $5-6." -- https://olimex.wordpress.com/2016/01/20/you-guys-will-buy-your-avrs-from-microchip-from-now-on/

" So where is the evidence that open and free tools matter – well, lets have a look at Arduino – you cannot help but notice that the solution to almost every project that needs a micro-controller these days seems to be solved with an Arduino! and that platform has been built around Atmel parts, not Microchip parts. What happened here? With the Microchip parts you have much more choice and the on-board peripherals are generally broader in scope with more options and capabilities, and for the kinds of things that Arduino’s get used for, Microchip parts should have been a more obvious choice, but Atmel parts were used instead – why was that? ... The success of the Arduino platform is undeniable – if you put Arduino in your latest development product name its pretty much a foregone conclusion that you are going to sell it – just look at the frenzy amongst the component distributors and the Chinese dev board makers who are all getting in on the Arduino act, and why is this? well the Arduino platform has made micro-controllers accessible to the masses, and I don’t mean made them easy to buy, I mean made them easy to use for people that would otherwise not be able to set up and use a complex development environment, toolset and language, and the Arduino designers also removed the need to have a special programmer/debugger tool, a simple USB port and a boot-loader means that with just a board and a USB cable and a simple development environment you are up and running which is really excellent. You are not going to do real-time data processing or high speed control systems with an Arduino because of its hardware abstraction but for many other things the Arduino is more than good enough ...

Now this is the part where the product team, executives and the board at Microchip should pay very close attention. I made contact with David Cuartielles who is Assistant Professor at Malmo University in Sweden, but more relevant here is that he is one of the Co-founders of the original Arduino project. I wrote David and asked him…

“I am curious to know what drove the adoption of the Atmel micro controllers for the Arduino platform? I ask that in the context of knowing PIC micro controllers and wondering with the rich on-board peripherals of the PIC family which would have been useful in the Arduino platform why you chose Atmel devices.”

David was very gracious and responded within a couple of hours. He responded with the following statement:

“The decision was simple, despite the fact that -back in 2005- there was no Atmel chip with USB on board unlike the 18F family from Microchip that had native USB on through hole chips, the Atmel compiler was not only free, but open source, what allowed migrating it to all main operating systems at no expense. Cross-platform cross-compilation was key for us, much more than on board peripherals.” ... I am clearly complaining about the crippling of Microchip provided compilers...why do they suppress their developer community with crippled compiler tool software unless you pay large $$$... " -- http://gerrysweeney.com/microchip-pic-chips-could-have-been-the-power-behind-arduino/

various comments in this thread say the same thing: https://www.facepunch.com/showthread.php?t=1502428&p=49577864#post49577864

"although Atmel have similar products like Microchip and even better open source software support, they sales are terrible hard to deal with. Many components prices go unexpected up and down as Atmel production capabilities are humble, once some big customer place large order for one chip they stop making others and this make impossible to use them for serious projects. Once you put AVR in your product it is not unlikely these chips suddenly to go on allocation due to the poor management and planning Altmel has, something which (almost) never happen to Microchip." -- https://olimex.wordpress.com/2016/01/20/you-guys-will-buy-your-avrs-from-microchip-from-now-on/

---

some comments in here suggest that hobbyists are moving away from 8bit PIC and AVR to ARM:

https://www.reddit.com/r/arduino/comments/422vvr/what_does_the_impending_acquisition_of_atmel_by/

also

---

a comment on AVR vs MSP430:

" tptacek 1 day ago

So first a couple caveats: (1) I'm sure AVR is a hell of a lot better than PIC, (2) I come at this from a really weird place (exclusively emulators and compilers), and (3) I'm not talking about the AVR parts themselves, which might be more cost-effective for a given project.

That said:

I could pick a bunch more nits that would only really be relevant to someone writing an emulator (complicated instruction decode, IO addressing, &c) but those are my big complaints.

I find MSP430 much more pleasant to work in.

reply "

---

here's a guide to choosing crypto key lengths (this is relevant here b/c I'm obsessed with choosing good native bit widths for computers; the EVM's use of 256 bits for partially crypto reasons makes me want to know what the bit width of various crypto systems is; I still like 16 bits though for a VM):

https://www.keylength.com/en/3/

---

the 'data' in each ipfs object is less than 256k

---

ARM Cortex-A32 (MMU-full but 32-bit)

http://arstechnica.com/gadgets/2016/02/arms-cortex-a32-is-a-tiny-cpu-for-wearables-and-raspberry-pi-like-boards/

---

Core manager (CM):

    Extended Xtensa LX4
    scheduling-specific instruction set
    32KB for code
    64KB for data

Processing Elements (PEs):

    Xtensa LX4 from Tensilica (now Cadence)
    32KB for code
    32KB for data

Application Core (App):

    570T core from Tensilica (now Cadence)
    16KB cache for code
    16KB cache for data

2 x 128MB DRAM

http://dsg.uwaterloo.ca/seminars/notes/2014-15/Lehner.pdf

--- " Accordingly, Raspberry Pi 3 is now on sale for $35 (the same price as the existing Raspberry Pi 2), featuring:

    A 1.2GHz 64-bit quad-core ARM Cortex-A53 CPU (~10x the performance of Raspberry Pi 1)
    Integrated 802.11n wireless LAN and Bluetooth 4.1
    Complete compatibility with Raspberry Pi 1 and 2... The 900MHz 32-bit quad-core ARM Cortex-A7 CPU complex has been replaced by a custom-hardened 1.2GHz 64-bit quad-core ARM Cortex-A53. Combining a 33% increase in clock speed with various architectural enhancements, this provides a 50-60% increase in performance in 32-bit mode versus Raspberry Pi 2, or roughly a factor of ten over the original Raspberry Pi. ... VideoCore IV 3D is the only publicly documented 3D graphics core for ARM-based SoCs, and we want to make Raspberry Pi more open over time, not less. BCM2837 runs most of the VideoCore IV subsystem at 400MHz and the 3D core at 300MHz (versus 250MHz for earlier devices). "

"

schappim 8 hours ago

What has changed:

What is the same:

---

"The PC world roughly began in 1975 with the introduction of the MITS Altair 8800, based on INTEL's 1MHz 8080 8-bit microprocessor. "

---

http://atomthreads.com/

"Atomthreads is a free, lightweight, portable, real-time scheduler for embedded systems."

---

https://www.technologyreview.com/s/601263/why-a-chip-thats-bad-at-math-can-help-computers-tackle-harder-problems/

Singular Computing's S1 chip "In a simulated test using software that tracks objects such as cars in video, Singular’s approach was capable of processing frames almost 100 times faster than a conventional processor restricted to doing correct math—while using less than 2 percent as much power...Ask it to add 1 and 1 and you will get answers like 2.01 or 1.98. The Pentagon research agency DARPA funded the creation of Singular’s chip..."
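
to make the idea concrete, here's a toy C model of what "incorrect math" might look like; the error model (uniform +/-1% relative noise per add) is invented for illustration and is not Singular's actual design:

    #include <stdio.h>
    #include <stdlib.h>

    /* toy model of approximate arithmetic: every add picks up a small
       random relative error, in the spirit of the "1 + 1 = 2.01" example
       above. purely illustrative; the real S1's error model isn't
       described in the article. */
    static double approx_add(double a, double b) {
        double noise = ((double)rand() / RAND_MAX - 0.5) * 0.02;  /* +/- 1% */
        return (a + b) * (1.0 + noise);
    }

    int main(void) {
        for (int i = 0; i < 3; i++)
            printf("1 + 1 = %.2f\n", approx_add(1.0, 1.0));  /* e.g. 2.01, 1.98 */
        return 0;
    }

the point of the tradeoff is that a hardware unit allowed to be sloppy like this can be much smaller and lower-power than an exact ALU, which is where the claimed speed and power numbers come from.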

---

in a discussion about radiation-resistant systems:

"What MCU are you using? Mainstream ARM are most likely not suitable at all. Better alternatives would be something like Freescale MPC56, TI TMS570 etc. These have lock-step cores, ECC and lots of error detection and redundancy implemented in hardware."

(tangentially, on designing systems for redundancy in this sort of environment:

https://stackoverflow.com/questions/36827659/compiling-an-application-for-use-in-highly-radioactive-environments

)

---

randyrand 2 days ago

Lots of misunderstanding in this comments section.

Movidius makes low-power neural network processors for mobile applications. The Myriad V1 is used in Google Tango and the V2 (which is what the USB stick has) is used in the new DJI Phantom 4.

http://www.theverge.com/2016/3/16/11242578/movidius-myriad-2...

The Myriad chips are interesting because they combine MIPI camera interface lanes on the same chip as a general-purpose NN/CV processor and an SDK suite of hardware-accelerated computer vision functions (edge detection, Gaussian blur, etc).

here's the white paper for the chip: http://uploads.movidius.com/1441734401-Myriad-2-product-brie...

Because programming these chips essentially requires having the hardware, and because the hardware was very hard to come by, programming these chips was mostly limited to Google, DJI, and other big partners.

With this release the everyday developer has access to these vision processing chips, and the barrier to development entry is considerably lower.

This is not meant to replace your titan X gpu.

reply

revelation 2 days ago

This is their own press release. What does that kind of hardware for CV primitives have to do with deep learning?

(Also, of course, this stick doesn't seem to have any kind of connectivity besides the USB to the host computer. How do I connect my camera? Having to shuffle the data from a camera to the stick passing the host computer somewhat defeats the point.)

reply

krasin 2 days ago

>What does that kind of hardware for CV primitives have to do with deep learning?

They have hardware convolutions on 12 SHAVE cores (kind of a DSP core). It means that the chip can run some useful subset of convolutional neural networks very fast and energy-efficiently.

They also have 2 general-purpose SPARC cores, which allows you to have a "normal" program running there. Not sure how locked down the USB stick is going to be, and if running your custom program would be an option.

>How do I connect my camera?

The chip itself has a couple of MIPI lanes. The USB stick likely does not expose that. And I agree, that's suboptimal.

reply

---

some stuff mentioned in https://news.ycombinator.com/item?id=11777607 :

"...ESP8266 modules. No golang yet, but Arduino-C, Lua, Micropython, ...

here's micropython on ESP8266: https://github.com/micropython/micropython/releases/tag/v1.8

The ESP8266 is a line of very small (2cm) boards with wifi. They run a 32-bit RISC CPU: a Tensilica Xtensa LX106 running at 80 MHz. They don't have an instruction cache [5] but they do have 64 KiB of instruction RAM (in two 32k banks, i think [http://www.danielcasner.org/guidelines-for-writing-code-for-the-esp8266/ ]?; this is a Harvard architecture, so the 96k of data RAM is separate).
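
since that instruction RAM is so scarce, ESP8266 code has to be explicitly placed; a minimal C sketch against the ESP8266 NONOS SDK, assuming its usual ICACHE_FLASH_ATTR placement macro and os_printf/system_get_free_heap_size calls:

    #include "osapi.h"
    #include "user_interface.h"

    /* functions tagged ICACHE_FLASH_ATTR are linked into flash (irom0)
       and fetched through the flash cache, leaving the small instruction
       RAM for hot paths and anything an ISR may touch. */
    void ICACHE_FLASH_ATTR report_heap(void) {
        os_printf("free heap: %u\n", system_get_free_heap_size());
    }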

WiPy: https://www.pycom.io/solutions/py-boards/wipy/ https://www.kickstarter.com/projects/wipy/the-wipy-the-internet-of-things-taken-to-the-next

The WiPy is a small (5cm) board that runs micropython and has wifi as well as various other interfaces. 256k RAM, CPU (MCU): Texas Instruments CC3200, Cortex-M4 @ 80MHz. Not sure what, if any, L1 cache it has.

http://www.acmesystems.it/arietta (5cm, ARM9 128M RAM; AT91SAM9G25 CPU, 16k icache, 16k dcache)

Arch Linux on ARM list of hardware platforms: https://archlinuxarm.org/platforms/ (lowest RAM in that list: 128MB)

https://wiki.openwrt.org/toh/unbranded/a5-v11 MIPS (MIPS 24KEc, 32-bit, i think MIPS32 release 2 [6] mb see [7]) with 4MB flash, 32MB RAM, 32k icache, 16k dcache

notes on ESP8266 and TI CC3200: https://blog.cesanta.com/esp8266-and-cc3200-how-we-made-them-work-on-our-iot-platform-presentation

---

the Transcend WiFi SD card apparently uses an ARM926EJ-S (ARMv5) with 32k RAM. This model of ARM can have caches from 4k to 128k [8].

---

" Fermi and Kepler GPUs split 64 KB RAM between L1 and SMEM – Fermi GPUs ( CC 2.x ): 16:48 , 48:16 – Kepler GPUs ( CC 3.x ): 16:48 , 48:16 , 32:32 • Programmer can choose the split: – Default: 16 KB L1, 48 KB SMEM ...

Read-Only Cache An alternative to L1 when accessing DRAM – Also known as texture cache: all texture accesses use this cache

Caching is at 32 B granularity (L1, when caching operates at 128 B granularity)

Aggregate 48 KB per SM: 4 12-KB caches

" -- http://on-demand.gputechconf.com/gtc/2013/presentations/S3466-Programming-Guidelines-GPU-Architecture.pdf

" Constant Cache 8KB cache on each SM " -- http://courses.cms.caltech.edu/cs179/2016_lectures/cs179_2016_lec04.pdf

---

https://davidgf.net/page/41/e-ink-wifi-display

STM32F103ZE: ARM Cortex-M3 SoC, 64k RAM?

---

TI-85 graphic calculator:

TI-81:

TI-84+:

TI-89:
* RAM: 256 KiB
* CPU: Motorola 68000 @ 10 MHz

TI-Nspire:

see also

https://en.wikipedia.org/wiki/Comparison_of_Texas_Instruments_graphing_calculators

which graphing calculator do students use today (mid 2016)?

(afaict students can't just use their phones etc b/c tests don't permit arbitrary electronics) (crazily, some tests use whether the device has a QWERTY keyboard as a criterion, so the TI-89 is allowed and the TI-92 is not [9])

https://www.quora.com/Which-graphing-calculator-is-the-best-for-the-AP-Calculus-BC https://www.quora.com/What-is-the-best-graphing-calculator-for-AP-Calculus-AP-Physics-C-and-SAT-Math-subject-Test http://www.veronaschools.org/Page/790

The following are sometimes recommended but are prohibited by some tests [10]:

all these are programmable with TI-BASIC. The TI-Nspire is also programmable with Lua. However, TI-BASIC is very different between the Z80 and 68k TI calculator models:

https://en.wikipedia.org/wiki/TI-BASIC

---

http://www.atmel.com/devices/attiny85.aspx
CPU: 8-bit AVR
RAM: 512 bytes
flash: 8k

recc. by a SENSORICA guy for a task that needed something small.
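
for scale, a minimal avr-gcc blink sketch for the ATtiny85 (the pin choice PB0 and the 1 MHz factory clock are assumptions); the whole program fits in a few dozen bytes of the 8k flash and uses essentially no RAM beyond the stack:

    #define F_CPU 1000000UL  /* factory default: 8 MHz internal RC / 8 */
    #include <avr/io.h>
    #include <util/delay.h>

    int main(void) {
        DDRB |= _BV(PB0);         /* PB0 as output */
        for (;;) {
            PORTB ^= _BV(PB0);    /* toggle the pin */
            _delay_ms(500);
        }
    }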

---

this emulator has 64k:

http://www.pcjs.org/devices/pcx86/machine/5150/cga/64kb/donkey/

---

https://en.m.wikipedia.org/wiki/Adapteva

"The Epiphany architecture could accommodate chips with up to 4,096 RISC out-of-order microprocessors, all sharing a single 32-bit flat memory space"

"Each RISC processor in the Epiphany architecture is superscalar with 64× 32-bit unified register file (integer or single precision)"

"Each RISC processor (in current implementations; not fixed in the architecture) has 32 KB of local memory. "

"The memory architecture is unusual in that it doesn't employ explicit hierarchy or hardware caches, similar to the Sony/Toshiba/IBM cell processor, but with the additional benefit of off-chip & inter-core loads & stores being supported - which simplifies porting software to the architecture. It is a hardware implementation of partitioned global address space."

---

OSs:

https://github.com/fuchsia-mirror/magenta/blob/master/docs/mg_and_lk.md https://github.com/littlekernel/lk/wiki

---

kragen 311 days ago [-]

The most memory I've seen on a machine with current hardware and constant latency is 512 kilobytes, like in some of the Atmel ARM SAM D1 chips. Above that, you start getting into off-chip buses, which generally have substantially higher latency, even before you get to the extra latencies added by DRAM chips.

---

erlang unikernel that fits on a MIPS PIC32 MCU: " To shrink the size of the LING virtual machine a few things were left out (e.g. regular expressions and cryptographic functions) which shrank the resulting image size to about 1M. This meant that LING consumed roughly 50% of Program RAM. The Data RAM was used for heaps of Erlang processes."

---

" Microsoft today revealed a first look at the inside of its Holographic Processing Unit (HPU) chip used in its virtual reality HoloLens? specs.

The secretive HPU is a custom-designed TSMC-fabricated 28nm coprocessor that has 24 Tensilica DSP cores arranged in 12 clusters. It has about 65 million logic gates, 8MB of SRAM, and a layer of 1GB of low-power DDR3 RAM on top, all in a 12mm-by-12mm BGA package. We understand it can perform a trillion calculations a second.

It handles all the environment sensing and other input and output necessary for the virtual-reality goggles. It aggregates data from sensors and processes the wearer's gesture movements, all in hardware so it's faster than the equivalent code running on a general purpose CPU. Each DSP core is given a particular task to focus on.

The unit sits alongside a 14nm Intel Atom x86 Cherry Trail system-on-chip, which has its own 1GB of RAM and runs Windows 10 and apps that take advantage of the immersive noggin-fitted display. "

---

 ericseppanen 22 hours ago [-]

For those who haven't had the pleasure: developing on Tensilica Xtensa cores generally means living within 128-256KB of directly-accessible memory; a windowed register file that makes writing your own exception handlers "interesting"; a 6-year-old GCC bolted to a proprietary backend; per-seat licensing fees to use the compiler; and a corporate owner that's only halfway interested in the ecosystem they now control.

So yeah, kind of wishing it would just die and let ARM take over the embedded space.

reply

_yosefk 18 hours ago [-]

I'm not personally opposed to Tensilica "dying", especially if it doesn't involve Cadence dying, since they are a (somewhat indirect) competitor from my point of view, but ARM is not a substitute for Tensilica. You can't extend its ISA unless you license the architecture for $30M. A DSP like Tensilica's is also much more efficient than ARM on a range of tasks, and in particular, having local memory instead of caches is done for a reason. (The least efficient accelerator of all and the favorite of academics who have easy access to it, GPUs, also have this.)

As to their compiler licensing - that's what happens when you develop for a small niche, you get more expensive tools which are worse than the free ones used by the majority. But it doesn't mean that the thing doesn't have its uses. I hear that a recent chip by AMD had 40 Tensilica (smallish, inaccessible to most software) cores.

The same is true about CEVA (which was mentioned in a sister thread), more or less.

reply

---

" International versions of the top-end Android mobiles, which went on sale in March, sport a 14nm FinFET? Exynos 8890 system-on-chip that has four standard 1.6GHz ARM Cortex-A53 cores and four M1 cores running at 2.3 to 2.6GHz. Only two M1 cores are allowed to kick it up to the maximum frequency at any one time to avoid draining batteries and overheating pockets. Each M1 typically consumes less than three watts.

The M1, codenamed Mongoose, was designed from scratch in three years by a team in the US, and it runs 32-bit and 64-bit ARMv8-A code. In benchmarks, the Exynos 8890 SoC? is behind Apple's iPhone 6S A9 chip in terms of single-core performance, but pushes ahead in multi-core tests.

...

A basic branch prediction system works by building an internal table that has a two-bit counter per recently seen branch instruction. If a branch is taken, you add one to its counter up to a maximum of three, and if it isn't, you subtract one until you reach zero. So if a branch's counter is stuck at three then it is regularly taken, and you can anticipate that. If it is sitting on zero, it is rarely taken, so you can ignore it and continue fetching and decoding the following instructions. ... the ((M1's)) branch predictor uses a neural network ...AMD's Jaguar and Bobcat predictors use similar technology...AMD's Zen architect Mike Clark confirmed to us his microarchitecture uses a hashed perceptron system in its branch prediction. "Maybe I should have called it a neural net," he added." "

the M1 has a:

-- http://www.theregister.co.uk/2016/08/22/samsung_m1_core/
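
the two-bit scheme described in the quote above is simple enough to sketch in a few lines of C (the table size and PC hashing here are made up; the usual rule is to predict taken when the counter is 2 or 3). the M1's perceptron predictor replaces this table of counters with learned weights:

    #include <stdbool.h>
    #include <stdint.h>

    #define SLOTS 1024            /* hypothetical table size */
    static uint8_t ctr[SLOTS];    /* 2-bit saturating counters, 0..3 */

    static uint32_t slot(uint32_t pc) { return (pc >> 2) % SLOTS; }

    /* predict: high half of the counter range means "taken" */
    bool predict_taken(uint32_t pc) { return ctr[slot(pc)] >= 2; }

    /* train on the actual outcome: saturate at 0 and 3 */
    void train(uint32_t pc, bool taken) {
        uint8_t *c = &ctr[slot(pc)];
        if (taken) { if (*c < 3) (*c)++; }
        else       { if (*c > 0) (*c)--; }
    }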

---

"By way of information an INTCODE BCPL2 with 16bit word sizes occupies about 20K of INTCODE plus data. CINTCODE is supposed to be more compact so a 16K ROM holding the CINTCODE and the rest in RAM seems completely in order. The interpreter code is much more compact than native assembler in most cases. 86.11.47.92 (talk) 16:43, 18 March 2016 (UTC)"