Bayle Shanks's website: proj-oot-lowEndTargets-propellerXmos

about the Propeller and the XMOS:

" 1) Dispense with all those peripheral devices, UART, USB etc etc and use the space for flexible processing power instead. 2) Allow for lots of concurrent activity. 3) That lets us dispense with interrupts and instead we will have simpler predictable event driven programming. 4) Have deterministic timing. 5) Have tight coupling between the processors and the IO pins. 6) All this allows the general purpose processing power be used for creating those missing peripherals in software, "Software Defined Silicon". "

---

    Re: Parallax Propeller, why is it so overlooked?
    screamingtiger
    screamingtiger Top Member Aug 30, 2015 7:25 PM (in response to rwgast)

    Until recently there wasn't a good C compiler. No one wants to use Spin, and PASM is an oddball.
        Re: Parallax Propeller, why is it so overlooked?
        rwgast
        rwgast Aug 30, 2015 7:57 PM (in response to screamingtiger)

        What do you mean PASM is an oddball? I ask because that is what I'm using to learn ASM; as I understand it, if you know one chip you can pretty much figure out ASM on any chip with a data sheet. Obviously the lack of interrupts is different, but what else?

        I do agree Spin was the worst part of the chip; although it is a nice language, it may make the learning curve steeper. The fact that you can make a ton of counters or eight SPI buses is really cool. I like how nothing except the UART/EEPROM/XTAL pins is special purpose.
            Re: Parallax Propeller, why is it so overlooked?
            screamingtiger
            screamingtiger Top Member Aug 30, 2015 8:39 PM (in response to rwgast)

            I've read numerous things, such as this from http://forums.parallax.com/discussion/132966/propeller-assembly-for-beginners

            1. There are 496 "registers" which may contain code or data. Certain constructs require operating on code as data (i.e. arrays in COG RAM).
            2. Pipelining means changes to instructions don't take effect until the next instruction.
            3. How JMPRET (aka JMP, CALL, RET) and DJNZ/TJNZ/TJZ work, which is why you normally put a # before the label.
            4. Flags - having to specify which flags get updated, the power of conditional operations, and the limitations of only having two flags.
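
Points 1 and 2 in the list above interact, and that is easier to see in a toy model than in prose. The Python sketch below is not PASM and not a Propeller emulator: the three-field instruction tuples, the `run` and `make_cog` helpers, and the opcode names (borrowed loosely from PASM's MOVS, MOV and NOP) are illustrative assumptions. It models only one idea: the next instruction is fetched before the current one writes its result, so an instruction that patches its immediate successor arrives too late.

```python
COG_REGISTERS = 496  # code and data share these locations (point 1 above)

def run(mem, steps):
    """Execute `steps` instructions from address 0 of a toy cog.

    Each instruction is a tuple (op, dst, src); data words are plain ints.
    The next instruction is fetched BEFORE the current one executes,
    which is the pipelining effect described in point 2.
    """
    pc = 0
    fetched = mem[pc]
    for _ in range(steps):
        current = fetched
        pc += 1
        fetched = mem[pc]          # prefetch: sees memory as it is right now
        op, dst, src = current
        if op == "movs":           # rewrite the source field of the instruction at dst
            target_op, target_dst, _ = mem[dst]
            mem[dst] = (target_op, target_dst, src)
        elif op == "mov":          # copy the word at src into dst
            mem[dst] = mem[src]
        # "nop" does nothing
    return mem

def make_cog(program):
    """Build a cog image: program at address 0, a 3-entry table at 100..102."""
    mem = [("nop", 0, 0)] * COG_REGISTERS
    mem[:len(program)] = program
    mem[100:103] = [111, 222, 333]
    mem[200] = 0                   # result register
    return mem

# Array read with one instruction between the patch and its target.
spaced = make_cog([("movs", 2, 101), ("nop", 0, 0), ("mov", 200, 100)])
# Same read with the patch immediately before its target: the old copy runs.
adjacent = make_cog([("movs", 1, 101), ("mov", 200, 100)])

print(run(spaced, 3)[200])    # reads table entry 1
print(run(adjacent, 2)[200])  # still reads table entry 0
```

In this model, indexing an array means rewriting an instruction's source field at run time, which is the "operating on code as data" the post refers to.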

            After reading about multiple oddball things, I was waiting for the GNU C compiler, which finally came out of beta recently.

            Have you seen the new Propeller HAT for the Raspberry Pi?

Re: Parallax Propeller, why is it so overlooked? shabaz shabaz Top Member Aug 30, 2015 11:33 PM (in response to rwgast)

Given there will be a learning curve with any device, if I had to choose between the two, based on current information I'd pick XMOS because the development environment has a C compiler, the process synchronisation is POSIX-like and so familiar to software developers, the development boards are cheaper ($12.50, and Raspberry Pi compatible), the parts are more powerful and faster, and there is recent silicon, which provides roadmap assurance. XMOS is used in commercial designs; for example, a lot of high-end audio devices use XMOS.

Not knocking Parallax Propeller, I don't know it well enough and I'm sure it must be in commercial designs too to keep Parallax in business, and it does look like an interesting part but XMOS are more interesting to me.


---

Re: Parallax Propeller, why is it so overlooked? crjeder crjeder Nov 2, 2015 5:59 AM (in response to rwgast)

    .. so why arduino?

Good question, IMHO. Why Propeller? Why anything other than ARM?

There is the market, you know.

I have seen so many brilliant designs die despite being much better than the dominant technology. Why does everybody use Intel CPUs and not SPARC or MIPS? Why does everybody have nVidia graphics chips and not BitBoys? (Does anybody remember them?) Part of the answer is that improvement of the dominant technology is much quicker than development on a budget in a niche. BitBoys and Inmos Transputers, for instance, shared the same fate: their development cycles were so long that by the time their products finally hit the market, nVidia and Intel respectively had reached the same performance level. So why switch? Consequently nobody did. Therefore they (BitBoys and Inmos) had no money to stay in the race.

Another thing is that parallel programming never took off. The only widespread thing is multitasking / multithreading, i.e. running many INDEPENDENT programs in parallel. But who uses 100+ cores to solve a single problem? This is still the domain of science, not consumer grade.

So the real question is: Propeller, why is it still around? They must have found a niche which pays them enough to survive.

Just my 2ct...


Re: Parallax Propeller, why is it so overlooked? rwgast rwgast Aug 31, 2015 12:16 AM (in response to rwgast)

clemm are you talking about something like Zynq?

Ok, so let's take this down to the hobbyist level. Parallax has been around forever and they offer all their learning material FREE!! I am an electrician who wanted to learn electronics, so I bought a friendly Prop DIP chip, RTFM, and stuck it on a breadboard with some Digi-Key parts, and it worked. Next I soldered it to RatShack perfboard and added PS/2, VGA and SD connectors! I mean, 8 cores: this sounded cool, and I could use it to start learning electronics!

Parallax has some boards like the QuickStart which cost as much as an Uno, so why Arduino? Why not an 8-core MCU that's backed by a company that's made BASIC Stamps since the 90s? Spin shouldn't even be the issue here, since it has constructs from languages like BASIC/Python, which is much easier for a coder who doesn't know C, like most Arduinoists...

I am really into SDR right now, and there are examples of using only the Prop's counters to make an AM receiver, so I can imagine the power of XMOS! But what I'm really into for an awesome SDR is the new Parallella board with the ARM CPU and the 16 cores + FPGA. I'm trying to make the PSoC line my new chip, over the AVR or another ARM core, for smaller projects that don't need parallelism. I mean, I can replace some complete boards with a PSoC 5... great for small test tools!

Got a little off topic though: why Arduino and not the Propeller? Obviously XMOS is more powerful, but not hobbyist friendly, or should I say newbie friendly. BTW I got into uCs 5 years ago, and at that time it was kind of like the Prop, BASIC Stamp or Arduino, and the Prop had more power than an ATmega and was just as easy to work with on the software/custom hardware side.

Off topic: can someone tell me how to mark a response helpful??!


Re: Parallax Propeller, why is it so overlooked? michaelkellett michaelkellett Top Member Aug 31, 2015 2:42 AM (in response to rwgast)

The Propeller chip is unpopular for so many reasons: it is very limited compared with ARM Cortex and far from cheap.

The 8 processor model is difficult to design with and this puts people off but there are worse features to come - each processor has a very limited amount of memory (512 words is tiny) and access to a bigger common memory (still only a tiny 8k words) is time sliced and so very slow. The next big issue is that the chip has effectively no on-chip peripherals. A typical cheap ARM Cortex will have SPI, I2C, UARTS, timers, USB etc etc.

Several companies have attempted to dump peripherals and code them all in software - it's never popular because it's hard work and difficult to support in practice and always gives worse performance than dedicated peripheral functions. XMOS are currently plowing this furrow and have found a few niche applications but aren't breaking into the mainstream (they've inherited too much attitude from Inmos so I don't think they ever will make the big time.)

So while the Propeller is interesting, and can give good performance if you get a job that is a really good fit, it isn't a patch on general purpose micros for general purpose jobs - and they make up most of the market. For niche stuff DSPs are much better at DSP, FPGAs are much better at massively parallel work and can be clocked so much faster etc etc.

So right now I can buy an ARM Cortex with a 180MHz clock, 1Mbyte Flash, >192kBytes RAM, on chip USB, Ethernet, Encryption/Decryption etc etc for about the same price - the Propeller just doesn't offer anything except less stuff and more hassle.


Re: Parallax Propeller, why is it so overlooked? screamingtiger screamingtiger Top Member Nov 2, 2015 8:47 AM (in response to crjeder)

Don't forget if it is implemented in software, it can conflict with resources since it will usually use one or more of the built in timers, ISRs etc..


    Re: Parallax Propeller, why is it so overlooked?
    rwgast
    rwgast Nov 2, 2015 9:09 AM (in response to screamingtiger)

    The Propeller can bit-bang SPI at 10 MHz+ and the UART is 2 Mbaud... It doesn't have USB and a lot of other cool peripherals, but they are coming in the Prop 2. I personally think using a dedicated chip works just as well. I think ARM has taken off; it has the Arduino IDE on some chips, and mbed. I think it would be more popular if it were easier to port HAL layers across vendors and Cortex 0/1/2/3/4 architectures. I have plenty of ARM boards which won't run mbed.

    I guess I think of the Propeller as having 8 little Arduinos in one chip, with internal lines to talk to each other. There is no need for interrupts on the Propeller architecture, but if you must have them they have been implemented in software. The Propeller also has 16 hardware-based counters which are fairly useful; I've been reading a piece of software that uses 3 counters to receive RF, mix the IQ signals and output audio. This is all done with just an inductor antenna and a few other passives to make the Prop's sigma-delta A to D work. That ADC, btw, is very scalable, up to 16 bits (most likely higher if proper care is taken during layout); it's a resolution vs speed trade-off.
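
The resolution vs speed trade-off mentioned above can be made concrete with a small simulation. This is a hedged sketch of a generic first-order sigma-delta converter, not of the Propeller's counter modes or its external feedback network; the function names are invented for illustration, and the 80 MHz clock is the figure quoted elsewhere in this thread.

```python
def sigma_delta_count(vin, bits):
    """Count the 1s a first-order modulator emits over 2**bits clocks.

    vin is the input as a fraction of full scale (0.0 to 1.0).
    """
    acc = 0.0                       # integrator, standing in for the external passives
    ones = 0
    for _ in range(2 ** bits):
        bit = 1 if acc >= 0 else 0  # comparator decision, fed back to the input
        acc += vin - bit
        ones += bit
    return ones

def samples_per_second(clock_hz, bits):
    """Each result needs 2**bits clocks, so every extra bit halves the rate."""
    return clock_hz / 2 ** bits

for bits in (8, 12, 16):
    print(bits, sigma_delta_count(0.3, bits), samples_per_second(80_000_000, bits))
```

The count of ones divided by `2**bits` approximates the input, so more bits means a longer counting window and a proportionally lower sample rate.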

    The chip is very capable. It is niche, but its niche works very well for hobby robotics and automation. It's also much easier to program without interrupts; the deterministic nature is great... you can count on things getting done in a certain amount of time without interrupts disturbing them. I'm not trying to sound like a fan boy. I guess the real reason I wonder why the Arduino took off over the Propeller is that Parallax already had the hobby market with the BASIC Stamp, so why did hobbyists pick a 16 MHz single-core MCU over an 80 MHz 8-core MCU? I personally think the PSoC chips are great and wish they actually had the Arduino market; they are easy to use, and the digital fabric allows you to get your feet wet with an HDL when you're ready, on top of being an ARM core.
        Re: Parallax Propeller, why is it so overlooked?
        crjeder
        crjeder Nov 2, 2015 9:40 AM (in response to rwgast)

        No problem with fan boys

        I decided to go mainstream (ARM) instead of niche (XMOS), even though I like the solutions in the niche products. PSoC and XMOS are very interesting (I have dev boards lying around but no time to give them a closer look).

        In the past I had an Inmos Transputer board which got me into programming but turned out to be not relevant on the market. Same with parallel programming, tile based rendering and many other technologies I had great hopes for. For hobby it is ok to look beyond the mainstream, but if you have to make your living it's hard.

Re: Parallax Propeller, why is it so overlooked? michaelkellett michaelkellett Top Member Nov 2, 2015 9:52 AM (in response to crjeder)

For my work stuff I use mainly ARM based micros with FPGAs for the hardware gymnastics (when required).

I rather like the idea of all the esoteric chips but they've never looked convincing enough to actually use (Propeller, Parallella, XMOS, GreenArrays GA144 etc)

re. Joey's comment on I2C - I do often bit bang I2C on micros because I've had so much trouble with the built in stuff in the past. But it does mean that the micro is busy all the time it's I2C-ing.


---

Re: Parallax Propeller, why is it so overlooked? rwgast rwgast Nov 2, 2015 11:18 AM (in response to michaelkellett)

michaelkellett, the cool thing about a Propeller or even an XMOS is the ability to bit-bang protocols in 1 core/cog (I'm pretty sure you can get 20 MHz SPI in PASM) and still have room left over to do other tasks in that core.
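
For readers who have not bit-banged a protocol before, here is a minimal sketch of what the software has to do for one SPI byte (mode 0, MSB first). It is plain Python operating on a list of pin states rather than PASM or XC, the function names are made up for the example, and it says nothing about the speeds quoted above; on real hardware each tuple would be one write to the output pins.

```python
def spi_shift_out(byte):
    """Return the (sclk, mosi) pin states for one byte, MSB first, SPI mode 0."""
    wave = []
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        wave.append((0, bit))   # set up data while the clock is low
        wave.append((1, bit))   # rising edge: the receiver samples here
    wave.append((0, 0))         # return the clock to idle
    return wave

def spi_shift_in(wave):
    """Recover the byte by sampling mosi on every rising clock edge."""
    value, prev_clk = 0, 0
    for sclk, mosi in wave:
        if sclk == 1 and prev_clk == 0:
            value = (value << 1) | mosi
        prev_clk = sclk
    return value

print(hex(spi_shift_in(spi_shift_out(0xA5))))
```

Every bit costs at least two pin updates, which is why the per-instruction timing of the core sets the ceiling on a bit-banged clock rate.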

I really have my eye on the Parallella boards... with their Zynq and the multi-core they look awesome for SDR and other intensive applications.


    Re: Parallax Propeller, why is it so overlooked?
    michaelkellett
    michaelkellett Top Member Nov 2, 2015 12:11 PM (in response to rwgast)

    Yep, that's what I meant when I said earlier that the XMOS has "special support logic" - it (and some other processors) are much better at bit banging than your normal micro.

    Unfortunately the price that is paid for that (in the widest sense) has (so far) kept such processors out of the mainstream.

    There's been a thread on E14 recently talking about the Snickerdoodle (I wish they had thought of  a less silly name) which is an even cheaper way of getting a Zynq (but no array processor).

    MK

Re: Parallax Propeller, why is it so overlooked? rwgast rwgast Nov 2, 2015 3:27 PM (in response to DAB)

@DAB

Ya, there are some neat things that can be done with it... I mean, you have 16 counters that are independent hardware implementations, a 2 Mbaud UART and, as I mentioned, a scalable sigma-delta ADC, which is a speed vs resolution type of idea; this only requires careful layout and passives. You can probably get a 20-bit ADC with a dead-bug layout if you don't want to design a mixed-signal PCB for manufacturing. I choose the Propeller when I want pure determinism and speed, and do not mind laying out special peripheral chips as needed. Plop a Cypress FX2 on the PCB and now you have 480 Mbit USB, and a WIZnet or ESP8266 and you have wifi. A few differential/instrumentation amps and Bob's your uncle! Most of the time this works well because I'm not going to whack a 24-bit DAC on an ATmega328. If determinism isn't very important and I can get by with lackluster peripherals, I will go with the AVR every time. I don't need a low-noise hi-res DAC to display a battery voltage on a 16x2 LCD; this is where AVR works great, and it will drive a few relays to switch battery pack banks in and out with general purpose power to spare!

My personal system for my robot has a few SRAMs and I2C EEPROMs on board along with a Propeller chip and an ATmega328 (actually two; one 328 runs the battery system: 16 2400ah NiMH cells split into two sets of eight. One 8-cell set runs the motors, the other 8-cell set runs everything else; the idea was to get the motor transients isolated all the way back to one single ground point between systems, directly on the battery pack). The 328 is connected to the Propeller chip by a 4-line bus between the two chips, with jumpers on each connection (this supports I2C, SPI, or a 4-bit parallel com bus). This is so I can use the 328 as a pin extender: its function is to read all the sensors and buffer the data into its SRAM, and then the Propeller polls the AVR's stored values. The Propeller is reading the quad encoders and running motor control chips. The rest of its cogs are there to make calculations on the data and move the robot in the appropriate way. This bot will eventually have 2 custom-made laser scanners (using Wii remote cameras) and an SBC with a cam running OpenCV. The SBC will crunch all the measurement data from the lasers and OpenCV, then send that off to the Propeller, which will think about the right move to make, combining the SBC data with the 328's sonar and bump sensor data. Eventually SLAM will come into play, and maybe Parallella! But this is a clear case for the Propeller. For those of you who haven't seen what it can do, I figured I'd post some unique things that I don't think other micros can do very easily:

http://forums.parallax.com/discussion/105674/hook-an-antenna-to-your-propeller-and-listen-to-the-radio-new-shortwave-prog/p1

http://forums.parallax.com/discussion/129652/the-lassiter-inexpensive-lidar-distance-detection-ranging-propeller-wiimote

http://forums.parallax.com/discussion/131954/my-attempt-at-23k256-sram-drivers-includes-8-bit-version

Triage Training System

learn.parallax.com

There is all kinds of stuff: 6502 and Z80 emulators that are deterministic, projects running CP/M on the Prop, etc... These links kind of show the diversity; there are a lot of industrial control and RF projects out there too.


---

http://forums.parallax.com/discussion/133447/xmos-chips-vs-p2/

mikef Member Posts: 15 Joined: Sun Dec 13, 2009 3:17 am

Sun Dec 13, 2009 5:30 pm RE: Propeller vs. XMOS

I've heard of it, even looked at it briefly. I've never been a fan of Parallax stuff, so I'm a bit biased from the get go. I'm not a fan of the Windows-only development environment, or the fact that it is only supported by a custom, proprietary language. With the XMOS, I've got a cross-platform C environment that I can pull code into from previous projects, no relearning languages, just have to learn the few XC keywords and a few rules. It's a much more industrial/professional toolchain, of the type I'm used to.

jmg Posts: 10,579 August 2011 edited August 2011 0

    Heater. wrote: »
    ....
    1) Dispense with all those peripheral devices, UART, USB etc etc and use the space for flexible processing power instead.
    2) Allow for lots of concurrent activity.
    3) That lets us dispense with interrupts and instead we will have simpler predictable event driven programming.
    4) Have deterministic timing.
    5) Have tight coupling between the processors and the IO pins.
    6) All this allows the general purpose processing power be used for creating those missing peripherals in software, "Software Defined Silicon".

This brings me to one of my main peeves with this class of device (applies to Xmos and Prop) :

It seems that 'mindset' made the designers lazy, and the unfortunate end result, is performance that is much lower than the Silicon could support, often for want of some simple scaling & better connection to the peripherals that are already there.

Take timing - both XMOS and Prop have multiple timers, but can they capture the time of a pin edge, to (eg) 32 bits, to the system clock precision ? (like a sub $2 ARM can ?) Can they prescale any pin, before doing such a capture ? [ Prop AN001 only mentions Capture once, and that is a SW read ]

Or PWM generation precision ? More uCs can now do this to 1ns, or 300ps or even 150ps edge precisions, and timer clocks faster than the core clock are more prevalent. This saves a lot of system power.

Examples that 'push to the corners', are also sparse, so a possible new user, has to do a lot of trawling...

The stock reply is "do this in sw", but now, the precision falls to that of a SW loop - yet I can see timers, and a fast core, and am left thinking "What if..."

So we find instead a choice of a mid-level CPLD, and more generic micro, can out perform the chips that claim to be specialised ?

jmg Posts: 10,579 August 2011 edited August 2011 0

    Leon wrote: »
    These XMOS documents might help:
    https://www.xmos.com/download/public/Programming-XC-on-XMOS-Devices%281%29.pdf
    https://www.xmos.com/download/public/XS1-Ports-Specification%281.02%29.pdf
    I/Os can handle 100 MHz events.

Yes, already have those. The 'handle 100MHz' has caveats, as I said, capture an edge to 32 bits, and you find the XMOS part has only 16b port timers, or you find they are data-streaming captures, which choke the SW above a certain (rather low) frequency/edge rate.
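
The 16-bit versus 32-bit point is about rollover. The sketch below is generic arithmetic, not XMOS code: the helper names are invented, and the 100 MHz tick rate is an assumption borrowed from the "100 MHz" figure quoted above. At that rate a 16-bit counter rolls over in well under a millisecond while a 32-bit one lasts tens of seconds, so two captures further apart than one rollover cannot be told apart without extra software bookkeeping.

```python
def wrap_period_s(bits, tick_hz):
    """Time until a free-running counter of the given width rolls over."""
    return (1 << bits) / tick_hz

def interval_ticks(start, end, bits):
    """Ticks between two captures; correct only if they are under one rollover apart."""
    return (end - start) % (1 << bits)

TICK_HZ = 100_000_000  # assumed timer rate, see the note above

print(wrap_period_s(16, TICK_HZ), wrap_period_s(32, TICK_HZ))
print(interval_ticks(65530, 5, 16))          # a short interval across one rollover
print(interval_ticks(0, 70000 % 65536, 16))  # a 70000-tick interval, misreported
```

The last line shows the failure mode: a true interval of 70000 ticks is indistinguishable from one of 4464 ticks once the capture is truncated to 16 bits.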

I guess they focused on data-streaming apps, and forgot about more general timing problems. Which is a shame, as all the blocks are there on silicon, just poorly conceived connections driven by a belief SW can solve anything...

rod1963 Posts: 750 August 2011 edited August 2011 0

To answer Andre's troll bait, the reason we don't see Xmos game consoles or some sort of media box is because there are better chips than either the Prop or Xmos processors for that sort of application. Xmos knows this and they don't even bother targeting something like game boxes or set-tops. Get seriously into that territory and you butt heads with TI's OMAP and other SOC/GPU solutions by the big players. Even the P2 won't be in its league.

Ale Ale Posts: 2,299 August 2011 edited August 2011 0

Pro XMOS (I have only programmed the G4 devices, as I haven't tested my L1 board yet): plenty of power, I mean the threads run FAST! The assembler is not that difficult (I'd like some more registers, but well...), and timers and ports are quite high-level. Ports are buffered and channels are buffered, which makes high-speed communication between threads (including threads on another core) possible and transparent (in how the channels work and the instructions provided to access them). I'd love something like that for the Prop. But making your own board is nowhere near as easy as with the Propeller. The Prop is simpler and works with far less (and no, they are IMHO not comparable!). The P2 could then be compared to current XMOS devices; sadly we do not have any P2 yet.

Bill Henning Posts: 6,445 August 2011 edited August 2011 0

XMOS advantages:

clocked buffered I/O makes some things VERY easy
more MIPS
channels between cores/chips
a G4 effectively has 16 x 100 MHz "threads" roughly equivalent to cogs, or 32 not quite as deterministic roughly 50 MHz threads

P2 advantages (based on available preliminary data):

ADC & DAC on every pin
"saner" I/O mapping
>100 MHz I/O
easy video generation
CLUT
8x 160 MHz (estimated) hard deterministic cogs

Until I can try both side by side, that is the best I can do for a tech comparison

Both will be a pain to prototype with (unless using breakout boards)

Both will have a place in the market, and most people will choose the processor based on how well it fits their application/budget.

Leon Posts: 7,619 August 2011 edited August 2011 0

XMOS devices are primarily intended for use by professionals, but also appeal to advanced hobbyists and university students. The Propeller is the other way round, a device that was originally designed for hobbyists that Parallax is now attempting to market more widely to professionals.

The two companies are similar in some ways - they have about the same number of staff (50 or so) and both devices were designed by two people.

Leon Heller G1HSM

K2 Posts: 519 August 2011 edited August 2011 0

Good answer potatohead.

I have had the XMOS IDE and XC on my own machine before. For really simple things, it is no more difficult to use than an ARM. But if you think you can easily exploit the features that make XMOS uniquely XMOS, you are sniffing glue. One can rattle off a whole list of features, but they are all meaningless if you can't make them work. I know specific cases where very intelligent and motivated individuals wasted months of their lives on XMOS and finally gave up in disgust.

Full disclosure should be a moral imperative.

Sal Ammoniac Posts: 213 August 2011 edited August 2011 0

    K2 wrote: »
    But if you think you can easily exploit the features that make XMOS uniquely XMOS, you are sniffing glue. One can rattle off a whole list of features, but they are all meaningless if you can't make them work. I know specific cases where very intelligent and motivated individuals wasted months of their lives on XMOS and finally gave up in disgust.

Details, please.

My experience with XMOS has been the opposite. I find the tools easy to use and have not had any trouble using the features that make XMOS unique. I even program it in its assembly language, which I find rather easy to use.

kuba Posts: 39 August 2011 edited August 2011 0

The major conceptual difference between XMOS and Propeller is that in XMOS, your code needs to be fast enough, and that's it. In Propeller, your code has to have exact timings. In XS1 XMOS architecture, the hardware does timings for you, and kicks your thread into action when the right time comes. This usually lets you forget about cycle counting, and I, for one, think it's high time someone came up with this. I've done enough cycle counting in my life -- on SX48 and on Z8 Encore!

The supposed "non-determinism" of the XMOS architecture is a somewhat imaginary problem. Imagine you've coded things for Propeller. Then you decide to have your clock go faster. You usually have to recode things, or at least tweak the code heavily. Nobody will insert NOPs for you if the clock goes faster. On XMOS, if your code runs on a slower device, it will run on any faster device, too, as long as you don't hardcode various clock divisors but slave them from a global clock speed. ... Adding to the XMOS's determinism is their timing analysis tool (XTA) -- this lets you do static proofs that certain code will execute within a given time. This guarantees that your code will work as planned on the thread(s) you chose, and on any faster thread(s), too.

I do agree that XMOS documentation is fragmented, and some things are not clearly spelled out and that sets you up for initial frustrations. ...

kuba Posts: 39 August 2011 edited August 2011 0

There is an aspect of hardware-enforced "don't shoot yourself in the foot" that's perhaps hard to come by on most other architectures. Propeller's unique design forces communication between cogs via the hub and that's it. You have shared semaphores, but apart from it you're free to devise any length of rope to hang yourself with. Say, a pin read-modify-write race between cogs.

On XS1, the threads similarly have sequential access to RAM. Access to some resources is exclusive to a thread, and that's enforced by the hardware that will raise an exception -- so you can't cheat even in assembly. Some resources can be shared.

Highlights of hardware-generated exceptions are below. All of those help at catching software bugs and some are fairly unique to XS1 and make it a more robust platform.

trying to send interconnect (link) control tokens in software (analogous to preventing spoofing ethernet packets)
illegal PC counter value (on many MCUs, you can execute in lala land as long as the default value in unmapped memory happens to be a valid opcode)
illegal resource (use of invalid resource identifier/handle, akin to trying to read from a closed file handle), this is methinks unique to XS1 and no other architecture has anything like that
access to undefined registers (say processor status) - instead of returning some "default" value, it's correctly caught
resource dependency, when one thread tries to access the same resource within 4 clock cycles of another thread, this can catch race conditions.

kuba Posts: 39 August 2011 edited August 2011 0

In a nutshell: Propeller does have WAITCNT and WAITVID, on XS1 you have one WAIT instruction that you can use to wait for any combination of events from various peripherals. And that's only the beginning.

XS1 has a fairly powerful software-controlled interrupt vectoring system. The interrupt vectors are not permanently assigned to peripherals, like in many MCUs. Instead, you can assign any vector to any event-generating peripheral (I/O port, etc). When the event happens, the vector points to the next instruction to be scheduled for given thread. A vector is specific to a thread, so you have full thread affinity for responding to external events.

The classical problem of what to do if different events all reuse same interrupt handler (vector) is handled very nicely, too. Normally you have to interrogate status bits to know what happened, if the handler could be triggered by different things. On XS1, to each event source you assign a so-called environment vector. It's simply a data word that's available in your interrupt/event handler, and lets you adjust your logic according to interrupt source. You can use it as a bitmask, as a jump offset or table offset, or whatever suits your application. I haven't seen anything like that on any of the mainstream MCUs -- feel free to correct me if I'm wrong. You normally have to emulate this by setting up code to write a value somewhere, then jump to a common handler code. This costs precious cycles. On XS1, an event/interrupt handler can be done in a couple thread cycles -- say in 80ns. That's less than one clock period on some MCUs.

The major difference between XS1 and Propeller (P1 and P2 both) is that Propeller has no interrupt support at all. On XS1, event/interrupt support enables essentially free event-driven switch statements. You can wait on many things to happen, and there's no time penalty for that. Waiting on one event is no different from waiting on 10 events, in terms of latency. Of course if two things happen at the same time, you can't process them concurrently in the same thread, but at least your code doesn't get any slower from trying to wait on many things in the first place. This is IMHO a very sane design decision.

The difference between events and interrupts is fairly simple on XS1. An Event handler does not preserve the PC. You have to be within a WAIT instruction for an event to fire. It is like a hardware-driven switch statement. You have sole control of the execution path after you're done handling an event. An Interrupt does the usual automagical PC/status storage in registers dedicated for that purpose, so there's no memory access overhead for that.
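
The event mechanism described in the last few paragraphs can be sketched in software. This is only an analogy in Python: `EventSource`, `wait` and `shared_handler` are names made up for the example, not XC or XS1 assembly, and the linear scan in `wait` stands in for hardware that, per the description above, has no per-source cost. What it does show is one handler serving several sources and using the environment word to tell them apart, without polling status bits.

```python
from collections import deque

class EventSource:
    """A port or timer that can be armed with a handler and an environment word."""
    def __init__(self, name):
        self.name = name
        self.vector = None      # where execution continues when the event fires
        self.env = None         # the "environment vector" handed to the handler
        self.pending = deque()

    def arm(self, vector, env):
        self.vector, self.env = vector, env

    def post(self, data):
        self.pending.append(data)

def wait(sources):
    """Dispatch the first ready event, like one hardware-driven switch statement."""
    for src in sources:
        if src.pending:
            return src.vector(src.env, src.pending.popleft())
    return None  # real hardware would pause the thread here instead of returning

def shared_handler(env, data):
    # One handler serves every source; env says which one fired.
    return (env, data)

uart_rx, button = EventSource("uart_rx"), EventSource("button")
uart_rx.arm(shared_handler, env=0)
button.arm(shared_handler, env=1)
button.post("down")
print(wait([uart_rx, button]))
```

Because control continues at the vector rather than returning to an interrupted instruction, the thread decides what to do next after each event, which matches the "event handler does not preserve the PC" description.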

...

P2 wins IMHO in terms of the usefulness and capability of its I/O pins and counters (plentiful!). It's completely unique in that sense. There are plenty of interfaces where you don't need anything besides what P2 would give to you. P2s major shortcoming could be the impedance mismatch between on-chip and off-chip communications: HUB access only works within a chip.

XS1 wins in terms of flexibility of its architecture: ease of communicating multiple cores, both on- and off-chip, the more powerful instruction set, more flexible memory architecture, and hardware resource management (resources: timers, clocks, channels/channel ends, ports). Contrary to P2, communicating between threads on a single core is no different than communicating between threads on two chips with a couple of XS-link switches in between them. If you want to cheat within a single core, you can always share RAM areas, with a big CAVEAT EMPTOR attached.

At the moment, the programming tools are IMHO somewhat tied between both platforms (of course I only judge what's out for P1). Viewport debugger is IMHO a good enough counterbalance to lack of high-level languages beyond SPIN, when you compare to higher level languages available for XS1 (C, C++ and XC).

XS1 and P2 will be, overall, very similarly performing devices, apart from cases where device-specific features will be of a make-or-break kind. I'd personally not attempt to code a complex industrial communications protocols in PASM or SPIN, but I'd gladly use P2 in a controller/sensor application as an I/O front-end and fixed-function DSP processor. The logic and networking would be probably better handled in XC on XS1.

There are some applications where P2 seems like a perfect fit, say filtering of modulator data from multi-channel sigma-delta converters like TIs ADS1278. One COG per channel, at 160MIPS, would be more than plenty to massage the data if the on-chip filters don't suit you. Of course, a "standard" way of doing it is within a large DSP or an FPGA, but P2 would be probably more cost effective, at least for R&D and small runs.

For some other applications, XS1 is a perfect fit. Implementing an EtherCAT slave (a fast, 100 Mbit Ethernet-based industrial communications protocol) turned out to be fairly easy on the XMOS device -- and here we're replacing a custom ASIC or FPGA IP with a purely software solution. Adding goodies like Ethernet-over-EtherCAT was simple once you got the hang of it. A stand-alone fixed-latency realtime Ethernet switch was easy to do as well. I'd probably be in deep dodoland trying to squeeze it all into COGs, even on a yet-to-emerge P2. On XS1-G4, I could partition threads across cores to evenly spread out RAM use across the cores, getting the most out of the 64 KB of shared data/code RAM.

Heater. Posts: 19,798 August 2011 edited August 2011 0 This is similar to the debate about HUB access timing on the Prop. Many have suggested that if some cogs are not in use then the HUB switch should skip them and hence give more frequent access to HUB RAM for those that are running. Or that there should be a mechanism to give more HUB access slots to selected cogs regardless of what else is running. And other variations on this theme. My take is that it is a complication that may be useful in some cases but in general messes up the determinism of the Prop. No longer could you just throw objects from OBEX or elsewhere into your project and be sure that they don't have bad timing interactions with each other. I'm inclined to make the same case against such priority schemes in xcore thread scheduling. Possibly useful for some cases where you want to push the thing over the current limits by a little, but in general messy and complicating.
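To make the determinism argument concrete, here is a minimal C sketch of the two HUB schemes. It assumes the Propeller 1 arrangement of eight cogs with one hub slot per cog every 16 system clocks; the function names, the 2-clock slot constant and the "skip idle cogs" scheme itself are illustrative assumptions, not anything Parallax ships.

```c
#include <assert.h>

#define NUM_COGS 8          /* Propeller 1 has eight cogs */
#define CLOCKS_PER_SLOT 2   /* assumed: 8 slots x 2 clocks = 16-clock rotation */

/* Fixed rotation: clocks cog `cog` waits from time `now` until its slot starts.
   The answer depends only on the cog number and the clock phase. */
unsigned hub_wait_fixed(unsigned cog, unsigned now)
{
    unsigned period = NUM_COGS * CLOCKS_PER_SLOT;
    unsigned slot_start = cog * CLOCKS_PER_SLOT;
    return (slot_start + period - now % period) % period;
}

/* Hypothetical "skip idle cogs" rotation: only cogs whose bit is set in
   active_mask get a slot, so the wait now depends on what else is running. */
unsigned hub_wait_skipping(unsigned cog, unsigned now, unsigned active_mask)
{
    unsigned active = 0, index = 0;
    assert(cog < NUM_COGS && (active_mask & (1u << cog)));
    for (unsigned i = 0; i < NUM_COGS; i++) {
        if (active_mask & (1u << i)) {
            if (i < cog)
                index++;        /* position of our cog among the active ones */
            active++;
        }
    }
    unsigned period = active * CLOCKS_PER_SLOT;
    unsigned slot_start = index * CLOCKS_PER_SLOT;
    return (slot_start + period - now % period) % period;
}
```

With the fixed scheme, cog 3 at phase 0 always waits 6 clocks. With the skipping scheme the same cog waits 6, 4 or 2 clocks depending on which other cogs happen to be active, which is the kind of cross-object timing interaction Heater is worried about.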

Heater. Posts: 19,798 August 2011 edited August 2011 0 Has anyone considered how magical the coginit PASM instruction is?

In one instruction we can: 1) Find a spare processor to run some code on. 2) Return an error if there is none left. 3) Load the required code into the core. 4) Give it a parameter, which in turn can point to a whole array of parameters 5) Start the core executing.

Never really thought about it before but is there any other processor that has such an instruction?

The xmos devices do not have such a thing. Sure they can start threads but it's for sure not a single instruction affair. Also threads on xmos cannot arbitrarily start child threads on other cores. Chiefly because cores do not share RAM I guess.
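The five steps Heater lists can be modelled in a few lines of C. This is only a toy model of what the instruction does in hardware: `cog_t`, `cognew_model` and the return convention of -1 for "no cog free" are assumptions made for illustration, and the real instruction copies a full cog image from HUB RAM rather than a caller-chosen number of words.

```c
#include <assert.h>
#include <string.h>

#define NUM_COGS  8
#define COG_WORDS 496   /* general-purpose cog RAM "registers" */

/* Toy model of one cog: a running flag, its private RAM and its parameter. */
typedef struct {
    int running;
    unsigned ram[COG_WORDS];
    unsigned par;
} cog_t;

/* Returns the id of the cog that was started, or -1 if none is free. */
int cognew_model(cog_t cogs[NUM_COGS], const unsigned *image,
                 unsigned words, unsigned par)
{
    assert(words <= COG_WORDS);
    for (int id = 0; id < NUM_COGS; id++) {
        if (!cogs[id].running) {                      /* 1) find a spare cog   */
            memcpy(cogs[id].ram, image,
                   words * sizeof image[0]);          /* 3) load the code      */
            cogs[id].par = par;                       /* 4) pass the parameter */
            cogs[id].running = 1;                     /* 5) start executing    */
            return id;
        }
    }
    return -1;                                        /* 2) error: none left   */
}
```

In software this is a loop, a copy and two stores; the point of the post is that the Propeller does the equivalent as one instruction.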

... The major win for XS1 is, IMHO, that you can respond to multiple events from a single thread with a fixed latency that has jitter within 10ns p-p. This is not possible on a single COG in Propeller. Say you wait for a dozen events. In one COG, you can only have a loop that checks various event sources in order, and dispatches based on that. The latency in event response depends on where in the loop your code happened to be when the event arrived. Of course if you only have a few events to wait on, you spread them out to individual COGs. On XS1, you can wait for as many events as can be provided by the hardware, and the latency for a thread to react to the event is fixed in terms of thread cycles (but not in terms of core clock). Port timers let you issue a response that's timed based on the time the event came in, in terms of 100MHz reference clock, so you can have very short, fixed latencies, measured in multiples of 10s of nanoseconds. The jitter is well under +/-10ns. For simple event handlers, you can have a reaction in a few tens of nanoseconds, completely deterministic. ...
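The single-COG polling loop described above can be sketched in C as follows. The event sources are modelled as plain flags and the latency is counted in "polls" rather than clock cycles; `poll_select` and its parameters are hypothetical names, not part of any Propeller or XMOS toolchain.

```c
#include <assert.h>

/* Check n event flags in order, beginning at index `start` (where the loop
   "happened to be" when the event arrived). Returns the index of the first
   flag found set, or -1 after one full pass; *polls receives the number of
   checks that were needed, a stand-in for response latency. */
int poll_select(const volatile int *flags, int n, int start, unsigned *polls)
{
    assert(n > 0 && start >= 0 && start < n);
    for (int k = 0; k < n; k++) {
        int i = (start + k) % n;
        if (flags[i]) {
            *polls = (unsigned)k + 1u;
            return i;
        }
    }
    *polls = (unsigned)n;
    return -1;
}
```

With four sources and an event pending on source 2, the event is found after 3 polls if the loop was at index 0 and after 4 polls if it was at index 3. That spread is the jitter kuba describes; a hardware wait on multiple events does not have it.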

 For Propeller I, all you really need is their manual and datasheet. That's where Prop's documentation shines, I admit, but Prop is a conceptually MUCH simpler architecture. That's not necessarily a weakness if you need to be up to speed quickly, but can be a weakness when you run into the limitations of Prop's architecture.

...

kuba Posts: 39 August 2011 edited August 2011 0

    Heater. wrote: »
    My take is that is a complication that may be usefull in some cases but in general messes up the determinism of the Prop. No longer could you just throw objects from OBEX or elsewhere into your project and be sure that they don't have bad timing interactions with each other.

It all depends on the programming style. If the object you're using is coded to depend on instruction timings, then yes, of course you're out of luck. This kind of coding is precisely what's NOT the way to do it on XS1, and would be IMHO the biggest mental leap to be taken when transitioning from a Prop or SX48 to XS1.

Even on the Prop, you could avoid it: sprinkle your code with WAITxxx instructions, so that every input or output happens at a predetermined time, and you don't have the problem anymore -- just like that. It may seem tedious on the Prop (or PIC/SX48, etc) since neither Spin nor any other high-level language that I know of lets you do it without the tedium. On a Prop it also will slow your code down, since the extra waits are not free.
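A minimal C sketch of the bookkeeping behind "every input or output happens at a predetermined time", assuming a free-running 32-bit tick counter that wraps around (as the Prop's system counter does). The helper names are made up for illustration, and the signed-difference test relies on the usual two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* True once the counter value `now` has reached or passed `deadline`.
   Valid as long as the two are less than 2^31 ticks apart, so it keeps
   working across counter wrap-around. */
int deadline_reached(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}

/* Next deadline on a fixed grid. Adding the period to the previous deadline
   (not to "now") means the time spent executing code in between does not
   accumulate as drift. */
uint32_t next_deadline(uint32_t previous_deadline, uint32_t period)
{
    assert(period < 0x80000000u);
    return previous_deadline + period;
}
```

A timed loop then does its I/O, computes `next_deadline`, and waits until `deadline_reached` is true; how long the work between waits takes no longer matters, as long as it fits in the period.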

On XS1, though, executing an input or output at a preset time (or with a timestamp) is a single line affair in the XC language, and it is free in the sense that the hardware timers drive the ports and you don't need any software for that, other than programming the registers to desired values.

One could port XC to Prop or other deterministic architectures (SX48, PIC12, eZ8) to provide this functionality in a high-level language setting, but the select statement that waits on multiple events can only be approximated in a tight loop since Prop cannot wait on more than one thing at once in a single COG. On architectures with interrupt support, like eZ8, there is a reasonably good approximation for XS1's WAIT instruction (or XC's select statement): you set up interrupt handlers, then enable interrupts and spin in a tight HALT loop. This usually provides for fixed latency (at the level of CPU clock cycles) between the occurrence of an external event, and your code handling it. At least on eZ8 it works as you'd expect it to. On many of my projects done on eZ8, the main loop is, in C, simply while (1) { asm("HALT"); }

I personally would be most happy if PII's analog circuitry for the pin drivers was integrated on the XMOS die. I don't care much for Propeller's code-executing core, but the circuitry supplied with each pin driver is nothing to sneer at. On the contrary, it's probably why many of my projects will use PII as soon as it comes out for some fixed-function filtering, and for interfacing with the world out there. The XMOS pins are your bog-standard digital I/O lines, not even very modern ones in the sense that buffered ports are pretty much stuck at having all constituent bits being either inputs or outputs, not individually configurable. So if you have a 4 bit port, there's no way AFAIK to change directions of individual pins. Of course you have plenty of single-bit ports, and if you want, you can collect all related outputs in a single, wider port, as long as you only access it from one thread at a time so that the resource protection exception won't undercut you.

...

The ease with which code is compartmentalized and re-used on XS1 is the same as on Prop, assuming the code was written by someone with an understanding of the architecture. A lot of comments are made by people who have no such understanding, unfortunately. XS1 has only 64 KB of RAM per core, so it's limited not unlike Prop is. Yet the benefit of XS1 is that you are not "wasting" RAM on a COG/thread -- any unused RAM is available to other threads.

...

I'm using XMOS-provided "objects" without any issues, they are simply given with a spec as to how many threads they use and how many MIPS the threads need to have available, and what is the format of data that I exchange with them. I communicate with the "objects" via communication channels, and can do so from threads running on physically separate chips, too. On a Propeller you have to communicate via the HUB, so if the code you run doesn't share a HUB with the object you're using, you need to devote yet another COG to communicating. On XS1, the hardware handles communications between cores, and there's no difference between communicating between cores on the same chip vs. cores on physically separate chips. The latter of course need some pins, but that's the same with Prop.

...

To give just one example of excellent code reuse on XS1: their USB library, called XUD, provides an entire USB device stack in a single thread. It requires your application to provide one thread to handle each device endpoint, and communicates with them via channels and shared memory variables. The shared memory is needed to support the bandwidth called for by high-speed (480 Mbit/s) USB. The setup of the library is very simple, all you need is to correctly attach the PHY to your XS1-L or XS1-G chip, and to link your project with the library. The API provided is as simple as it gets. I've got it to work with no problems other than bugs in my code related to the device-specific functionality.

I have been able to tweak this library to limit it to full speed (12 Mbit/s), and then it doesn't even need any shared memory for communications, the endpoint threads can be elsewhere on the network. For example the control endpoint (EP0) can be on one chip, whereas the bulk endpoint that streams the data can be on another chip, some inches away from the USB PHY and adjacent XS1 device.

...

This is all to be taken with a bucket of salt but the result was that whilst XMOS C and Catalina generate about the same number of instructions the XMOS binary code is about 50% of the size. This is mostly due to X using compact 16 bit instructions rather than the 32 bit that the Propeller requires.

If you want compact code on the Prop you need to go to Spin and take the performance hit of an interpreter.

...

Bill Henning wrote: » XMOS advantages:

clocked buffered I/O makes some things VERY easy
more MIPS
channels between cores/chips
a G4 effectively has 16 x 100Mhz "threads" roughly equivalent to cogs, or 32 not quite as deterministic roughly 50Mhz threads

P2 advantages (based on available preliminary data)

ADC & DAC on every pin
"saner" I/O mapping
>100Mhz I/O
easy video generation
CLUT
8x 160Mhz (estimated) hard deterministic cogs
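The "16 x 100Mhz or 32 x 50Mhz" thread figures in the XMOS list follow from how an XS1 core shares its pipeline. A small C sketch, assuming a 400 MHz core clock, a four-stage pipeline (so one thread can issue at most every fourth cycle) and at most eight threads per core; the function name is hypothetical.

```c
#include <assert.h>

/* Approximate MIPS available to each thread on one XS1 core.
   Assumption: a thread can issue at most once every 4 core cycles, and with
   more than 4 active threads the issue slots are shared round-robin. */
unsigned xs1_thread_mips(unsigned core_mhz, unsigned active_threads)
{
    assert(active_threads >= 1 && active_threads <= 8);
    unsigned divisor = active_threads < 4 ? 4 : active_threads;
    return core_mhz / divisor;
}
```

On a four-core G4 this gives 4 x 4 = 16 threads at 100 MIPS each, or 4 x 8 = 32 threads at 50 MIPS each, matching the figures quoted above.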

...

Both will be a pain to prototype with (unless using breakout boards)

...

jamisonman Posts: 2 June 2012 edited June 2012 0 I wasted a lot of time with XMOS, at first it appears great and it's built on C so it shouldn't take much to get good at it. I was WRONG uggg, I can't get over the way they implement their I/O structure, it's just awful and un-intuitive to me. Plus you should look at some of the other aspects of the XC language like "timers". I remember first reading their tutorials and of course eventually understanding what they were saying, but I definitely kept coming across basic stuff that I had already learned and having to stop and be like "wait, what the heck is going on here again". That is bad, 'cause once you pick it up you need to be able to move on quickly and not have to go over the same things again and again as complexity is introduced to the basics. Uggg, stay away from XMOS unless they are the only option. And don't expect to pick it up and get rollin like with an arduino. The people at Arduino did it right, let me tell you. Parallax isn't bad either but kinda slow and costs too much for the value you actually get from it.

Heater. Posts: 19,798 June 2012 edited June 2012 0 jamisonman,

    I can't get over the way they implement their I/O structure, it's just awful and un-intuitive to me. Plus you should look at some of the other aspects of the XC language like "timers".

In the interests of a factual and informative comparison between devices could you please give details of: 1) What is awful about the XMOS I/O structure? 2) What's the problem with XMOS timers?

Just saying things are bad does not really help unless it's pointed out why. From my, admittedly limited, playing with X devices I/O and timers worked as advertised. Which is not to say there aren't some features that can frustrate, like:

1) Having pins grouped into ports, you have to set a whole port width of pins to input or output together. Which makes randomly tacking devices onto an XMOS trickier than the Prop where all pins are independent and equal.

2) Each XMOS core has its own set of pins, cores cannot use pins on other cores. Again this can make allocating resources a bit more tricky and again it makes randomly tacking devices onto an XMOS harder.

All in all it makes life harder for the hobbyist tinkerer.

    Uggg, stay away from XMOS unless they are the only option. And don't expect to pick it up and get rollin like with an arduino.

That's odd. I found one could get a "hello world" program up on X as easily as you can on Arduino or the Prop, or flash LEDS and such. Yes of course getting the most out of an XMOS, or a Prop requires a lot of study but then that's true of most devices.

...

https://web.archive.org/web/20110311073336/http://grieg.gotdns.com/blog/?p=207

But last September, I was busy pounding my head against a Xilinx Spartan-3AN FPGA development board (and given its numerous header pins, this wasn’t particularly pleasant).

What I needed was an easy way to run four independent 32-bit counters at 100+ Mhz, to communicate with eight discrete serial ADCs, to perform multiple floating-point calculations, and to return all of the resulting data to a control PC using a single high-speed serial line.

I was told, and still believe, that this should be no problem for an FPGA like the Spartan-3AN. Unfortunately, prior to this project, I had no experience with VHDL, Verilog, or anything having to do with FPGAs or CPLDs (besides the blessed CompactRIO). So needless to say, I was having all sorts of weird problems. ...

I was running out of time and needed a solution fast. ... And boy am I glad I did. Once I got to writing code for the XMOS, I was an instant fan. ... This device was unlike anything I’d worked with in the past. And yet, I could code it in C or the C-like XC language (which supports parallel processing). This fact is what really set my mind at ease. The only slight issue was the lack of standard hardware modules you’d find in most microcontrollers; the XS1-G4 has no built-in hardware to transmit serial data, measure analog inputs, or generate PWM signals. But hey, neither do FPGAs. The plus here is that the XMOS has so much horsepower, you can implement anything in software.

The XC-2 development kit I received even came pre-programmed with a fancy web server demo application.

... why XMOS so important to all folks in the forum? ... the fact you can have 8 threads running on a core at the same time at 50MIPS without having to get into interrupt programming or using a complicated operating system on the chip.

Could be the ease of communication between multiple XMOS chips. No complicated hardware required or complicated comms software to write.

Could be the timing determinism of the I/O ports and program execution. Not to mention the speed of I/O being massively faster than the devices you mention.

....

Postby mio » Sat Dec 12, 2009 5:52 pm I was interested in what Heater said, too, and began to use XMOS. But, to tell the truth, I have some complaints about the present XMOS.

1. You must use one precious thread out of 4 for a simple structure like a UART.
2. Communication over a channel is slower than I thought before I used it.
3. The size of the SRAM is not sufficient, because code and data must share it.

However, I think that XMOS has enough charm that I am not too bothered by these. :P

...

Postby mio » Sun Dec 13, 2009 2:47 am

    Heater wrote:
    I must admit that using a 32 bit CPU for a UART or in the XMOS case a precious thread does seem to be very wasteful.
    However you can often find that something simple like a UART can be combined with one or more other simple things, depending on speed requirements, or even multiple UARTS in a single thread.
    This kind of flexibility starts to win you over from the typical MCU, where "oh dear, we've run out of built-in hardware gizmos" means choosing a different member of that MCU family.
    Anyway don't we have 8 threads per core to play with? That makes things look much better.

Yes, I partly agree.

XMOS gives me "select-case" for waiting on multiple inputs, but it is slightly difficult for someone used to the way of thinking of programming a regular CPU.

    Can't speak about channel speed but from my end it looks like the internal communication is wacky fast and the external serial links at 80Mb/s or 160Mb/s is huge. More for the parallel connections. The device I want to connect external channels to can only shift bits in and out at 5Mb/s.

It is true that the specifications of the channel look very high. The XMOS channel is much faster than an "ordinary OS" when strict synchronization between the threads is used.

But an XMOS channel is not "a hole into which you can push data whenever you need to." An asynchronous streaming channel has a buffer of only 8 bytes (2 words) at present. Therefore you must write additional code if you want asynchronous data delivery beyond that.

Because I tried to use the channel without knowing the above, I could not get the performance I wanted.
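The "additional code" mio mentions is typically some form of software FIFO between producer and consumer. Below is a minimal single-producer, single-consumer ring buffer in plain C as a sketch of the idea; the depth of 16 words and all names are assumptions, and it does not address how the two threads would safely share the memory on a real XS1 (XC normally forbids that kind of sharing).

```c
#include <assert.h>
#include <stdint.h>

#define RING_WORDS 16u   /* hypothetical depth; must be a power of two */

typedef struct {
    uint32_t data[RING_WORDS];
    unsigned head;       /* total number of words written so far */
    unsigned tail;       /* total number of words read so far    */
} ring_t;

void ring_init(ring_t *r)
{
    /* The index mask below only works for power-of-two sizes. */
    assert((RING_WORDS & (RING_WORDS - 1u)) == 0u);
    r->head = 0;
    r->tail = 0;
}

/* Producer side: returns 0 (and drops nothing) when the ring is full. */
int ring_put(ring_t *r, uint32_t word)
{
    if (r->head - r->tail == RING_WORDS)
        return 0;
    r->data[r->head++ & (RING_WORDS - 1u)] = word;
    return 1;
}

/* Consumer side: returns 0 when there is nothing to read. */
int ring_get(ring_t *r, uint32_t *word)
{
    if (r->head == r->tail)
        return 0;
    *word = r->data[r->tail++ & (RING_WORDS - 1u)];
    return 1;
}
```

The producer can then run up to RING_WORDS words ahead of the consumer instead of the 2 words the streaming channel buffers, at the cost of RAM and of having to decide what happens when the ring fills up.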

...

Postby mio » Sun Dec 13, 2009 3:59 am >Out of interest what throughput were you aiming for ?

I wanted to capture and buffer data from a parallel-input 8-bit ADC as fast as the XMOS chip can.

English is more difficult for me than xc :mrgreen: Please let me write the example by the program.

At first, I wrote the code below (this is partial code, or pseudo code).

    in buffered port:32 RXD8 = XS1_PORT_8A ;   // Data from ADC
    out port ASYNCCLK = XS1_PORT_1K ;

    int main() {
        streaming chan c ;
        par {
            sample_from_adc(c) ;   // trigger chanend not shown in this partial listing
            consumer(c) ;
        }
    }

Sample From ADC

    void sample_from_adc(streaming chanend c, chanend trigger) {
        unsigned d , tmp ;

        clock_setting() ;   // set the clock provided to the ADC
        while(1) {
            trigger :> tmp ;   // WAIT SAMPLE START
            for (unsigned address=0;address<MAXDATAS;address++){
                // Sample 4 datas = 32bit (Buffered input)
                RXD8 :> d ;
                c <: d ;
                // ASYNCCLK : for speed constraint check.
                //            It should be more than 1/8 (1/2*(32bit/8bit)) of samplerate.
                ASYNCCLK <: (address & 0x01) ;
            }
            trigger <: ENDOFSAMPLE_MESSAGE ;
        }
    }

Consumer example, save to RAM

    #define SIZE (0x4000 - 1)   // must be all 1s (used as an index mask)
    void consumer(streaming chanend c) {
        int address = 0 ;
        unsigned d ;
        while(1) {
            c :> d ;
            memory[address++ & SIZE] = d ;
        }
    }

But this code is too slow. It could buffer data at only a few mega-samples per second at most. (I'm watching ASYNCCLK's rate with an oscilloscope.)

So I rewrote the code, as below.

    #pragma unsafe arrays
    void sample_from_adc(chanend trigger) {
        unsigned d , tmp ;

        clock_setting() ;   // set the clock provided to the ADC
        while(1) {
            trigger :> tmp ;
            for (unsigned address=0;address<MAXDATAS;address++){
                // Sample 4 datas = 32bit (Buffered input)
                RXD8 :> buffer[address] ;
                // ASYNCCLK : for speed constraint check.
                //            It should be more than 1/8 (1/2*(32bit/8bit)) of samplerate.
                ASYNCCLK <: (address & 0x01) ;
            }
            trigger <: ENDOFSAMPLE_MESSAGE ;
        }
    }

Without using a channel, I can capture data at over 20Msps (the ASYNCCLK rate stays constant). So, I don't use a channel for fast data acquisition. :cry:
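The "1/8 of samplerate" check in the code comments can be worked through as arithmetic: each loop iteration reads one 32-bit buffered word (four 8-bit samples) and writes one level to ASYNCCLK, so one full square-wave period on the pin covers two iterations, i.e. eight samples. A small C sketch, with a hypothetical function name, assuming the loop keeps up with the port:

```c
#include <assert.h>

/* Frequency expected on the ASYNCCLK pin if the loop keeps pace with the ADC:
   one pin toggle per buffered word, two toggles per square-wave period. */
unsigned long asyncclk_hz(unsigned long samples_per_second,
                          unsigned sample_bits, unsigned word_bits)
{
    assert(sample_bits > 0 && word_bits % sample_bits == 0);
    unsigned long samples_per_word = word_bits / sample_bits;
    return samples_per_second / (2ul * samples_per_word);
}
```

For the 20 Msps case with 8-bit samples packed into 32-bit words, the scope should show a steady 2.5 MHz square wave; a lower or uneven rate means the loop is falling behind.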

...

" One of the interesting things he mentioned was a technique for identifying Americans - the stripes on their ties usually slope down from right to left. I always look out for it on TV. "

---