Logic Cortex - Programmable Microprocessor (WIP)

takethispie · Jul 2, 2016

Trekkerjoe hey! (You remember me? The guy with the CPU on Steam artwork.)
When you say you take all 4 cycles to execute an instruction, do you mean first 4 cycles -> first instruction, next 4 cycles -> second instruction?

With a 1 Hz clock, I have a throughput of 1 instruction per second on my CPU (except on a cold start, where it needs to fill the pipeline).
So you should definitely add pipelining, but this would need a good revamp of your current architecture. Also, you are using very complex components for simple operations, kind of a crossbreed between an accumulator-based architecture and a register-based architecture. I think that is the reason why the more you work on it, the harder it is to work on.
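To put rough numbers on the throughput point, here is a minimal sketch. It assumes an idealised machine with no stalls, and the function name and the 4-stage figure are illustrative, not taken from either build.

```python
def total_cycles(num_instructions: int, num_stages: int, pipelined: bool) -> int:
    """Clock cycles needed to finish a straight-line program (no stalls)."""
    if num_instructions == 0:
        return 0
    if pipelined:
        # Fill the pipeline once (cold start), then one instruction
        # completes on every following cycle.
        return num_stages + num_instructions - 1
    # Without overlap, each instruction occupies every stage by itself.
    return num_stages * num_instructions


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, total_cycles(n, 4, pipelined=False), total_cycles(n, 4, pipelined=True))
```

For long programs the pipelined count approaches one cycle per instruction, which is where the "1 instruction per second at 1 Hz" figure comes from.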

Trekkerjoe · Jul 2, 2016

takethispie Yeah, that thing was pretty awesome looking.

I have considered pipelining, so that I could try to get it to perform multiple operations at once. Yeah, the architecture I am using does have some quirks. The point of it was to keep the thing compact enough to put in a ship, or a robot. Yeah, that complicates matters a bit. I had recently simplified the CU a bit to lean a little closer to an accumulator setup, relying on the internal RAM more than before. Now it is a lot easier to set up the commands, as I don't have to break the rules as often.

The entire system runs off of a 4 Hz timer, allowing me to perform the four smaller instructions that it takes to perform one operation a second. The cycle looks like this: Iterate Counter > Start Operation > Store Result > Advance Counter. All of this usually happens in one second, but a few of the commands (especially the memory-related ones that can suspend the timer) can take two seconds. It may seem strange that I iterate before everything else, but by doing this, I can feed an iterate command into the program counter without advancing it first, which potentially makes the conditional logic laughably simple.
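The timing described above can be sketched roughly as follows. The opcode names and the `slow_ops` set are hypothetical stand-ins; the only things taken from the post are the four phases, the 4 Hz timer, and the fact that some memory commands take two seconds.

```python
PHASES = ("IterateCounter", "StartOperation", "StoreResult", "AdvanceCounter")
TICKS_PER_SECOND = 4  # the 4 Hz timer from the post


def run(program, slow_ops=frozenset({"LOAD", "STORE"})):
    """Return (trace, seconds) for a list of opcode names.

    slow_ops is a hypothetical set of memory commands that suspend the
    timer for one extra full cycle (4 ticks), standing in for the
    "can take two seconds" behaviour.
    """
    trace = []
    ticks = 0
    for opcode in program:
        for phase in PHASES:
            trace.append((ticks, opcode, phase))
            ticks += 1
        if opcode in slow_ops:
            ticks += len(PHASES)  # timer suspended while memory finishes
    return trace, ticks / TICKS_PER_SECOND


if __name__ == "__main__":
    trace, seconds = run(["ADD", "LOAD", "ADD"])
    for entry in trace:
        print(entry)
    print("total seconds:", seconds)
```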

For my next attempt at this, I think I am going to try a pipeline architecture, and I will not limit myself on size this time around.

takethispie · Jul 3, 2016

OK, so kind of pipelining but without the registers between each stage. Operations that take more than 1 second are really a big bottleneck though :s
Incrementing the program counter first is not really strange, I'm doing it too in all my processors.

For a pipeline-based architecture you should use a MIPS architecture (same as mine).

Trekkerjoe · Jul 3, 2016

Thanks for the tips. I wonder though, how does one handle switching logic within a pipeline? I would imagine you need safeguards to prevent you from running code that would normally be bypassed. I have looked up the subject, and it gave me a few ideas. The more I read about it, the more hazardous my system looks. I've had to do far too many simplifications on the hardware, but I am so close I can taste it.

Also, all your processors? You mean there are more? Impressive.

takethispie · Jul 3, 2016

Handling logic is just a bit more complicated than in a single-cycle processor: a forwarding unit is needed to prevent Read After Write errors (Write After Read does not normally come up in a simple in-order pipeline, and forwarding would not fix it anyway).
Example: in a 5-stage MIPS CPU (Instruction Fetch, Instruction Decode, EXecute, MEmory access, Write Back),
say instruction 1 is in the EX stage, adding 5 to the value of $3, with the result to be stored in register $4, and instruction 2 is in the ID stage, which will add registers $4 and $5. The thing is that instruction 2 won't use the correct value, as the result of instruction 1 has not been written back to the register file yet.
To resolve that issue, the forwarding unit will forward the result to instruction 2.
The forwarding unit can also forward from the ME stage or even the WB stage, where IO operations occur (in my architecture).
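The forwarding idea above can be sketched in a few lines. This is a simplified model under stated assumptions: the `InFlight` record and `read_operand` helper are hypothetical names, each stage is reduced to "destination register plus computed value", and load delays are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InFlight:
    """An older instruction whose result is not in the register file yet."""
    dest: int    # destination register number
    value: int   # result sitting in a pipeline latch


def read_operand(reg: int, regfile: List[int], ex: Optional[InFlight],
                 mem: Optional[InFlight], wb: Optional[InFlight]) -> int:
    """Pick the newest value of `reg`; the youngest in-flight producer wins."""
    for stage in (ex, mem, wb):  # ex holds the most recent instruction
        # Register $0 is hardwired to zero on MIPS, so never forward to it.
        if stage is not None and stage.dest == reg and reg != 0:
            return stage.value
    return regfile[reg]


if __name__ == "__main__":
    regs = [0] * 8
    regs[3], regs[5] = 7, 10
    # Instruction 1 (in EX): $4 = $3 + 5, not written back yet.
    ex = InFlight(dest=4, value=regs[3] + 5)
    # Instruction 2 (in ID): $6 = $4 + $5.
    with_fwd = read_operand(4, regs, ex, None, None) + read_operand(5, regs, ex, None, None)
    without_fwd = regs[4] + regs[5]
    print(with_fwd, without_fwd)
```

Without forwarding, instruction 2 reads the stale contents of $4 from the register file; with it, the value still sitting in the EX latch is used instead.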

I have only built one more, made before the one on Steam to test if logic was stable enough: a 4-bit computer (4-bit data, 16-bit instruction length), but about 12-15 in Logisim ^^
Actually my 8-bit processor is a heavily downgraded version of a processor I called DustCat, which is a 32-bit CPU with a 3-stage pipeline, interrupts and a 32-bit addressable 32-bit IO bus (there is a post about it somewhere in the StarMade forums xD).

GalactusX · Jul 8, 2016

One word... awesome... I've been working to do the same as you, but I only build different parts of a microprocessor separately; when finished I will put them all together.

Check out my logical creations, maybe they will give you some help or ideas.

KiloZulu · Jul 8, 2016

I can make doors open and close.

Every once in a while a light flashes.

Chaosinflesh · Jul 10, 2016

Late to the party...

Nice. Regarding pipelining, a pipelined processor has a control logic circuit. You don't need to pause instructions, you merely need to have a way to handle if you get it wrong. Depending on the depth of your pipeline, you may have to flush the last instruction or two. There's no need to shadow registers if you take this approach, however, if you wanted (and your memory architecture supported it), you could try to store/process both branches and choose which was correct later in the pipeline.

Another option is to use 'NOP' instructions after the branch, meaning nothing gets wasted regardless of the chosen decision, and you can still have simple IR logic.

You could also go for the simplest branch predictor - always taken.
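A back-of-the-envelope comparison of these options, as a sketch only. The function and strategy names are made up for illustration, NOP slots are counted as lost cycles (a modelling choice), and "predict taken" is assumed to know the branch target at fetch time, which real hardware needs extra logic for.

```python
def wasted_cycles(outcomes, penalty, strategy):
    """Cycles lost to branches under a very simple cost model.

    outcomes: list of bools, True where the branch was actually taken.
    penalty:  instructions fetched behind a branch before it resolves.
    strategy: 'nop' pads every branch with `penalty` NOPs;
              'predict_taken' / 'predict_not_taken' guess, and flush
              `penalty` instructions only when the guess was wrong.
    """
    if strategy == "nop":
        return penalty * len(outcomes)
    if strategy == "predict_taken":
        return penalty * sum(1 for taken in outcomes if not taken)
    if strategy == "predict_not_taken":
        return penalty * sum(1 for taken in outcomes if taken)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    # A loop-closing branch: taken 9 times, then falls through once.
    loop_branch = [True] * 9 + [False]
    for name in ("nop", "predict_taken", "predict_not_taken"):
        print(name, wasted_cycles(loop_branch, penalty=2, strategy=name))
```

For loop-heavy code, where most branches jump back to the top, guessing "taken" loses far fewer cycles than padding every branch, at the cost of needing flush logic.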

Anyway, nice work. Looking forward to seeing the next iteration.

Trekkerjoe · Jul 12, 2016

Thanks for all the encouragement! I had taken a break from all that logic. The CU was so messy it was nearly impossible to debug, and as a result, I started calling it the 'switchbox.'

However, today I managed to debug all of the microcode, including the branching logic. Instead of injecting a signal into the program counter, I wound up making it run the clock without updating the first cycle, allowing me to skip a line of code at the cost of a slight pause.

The only thing left to do is interface the bit shifter and move the parts into the correct place.

The next version will probably be more splayed out, and hopefully will be based on a proper pipeline, not the weird delayed signal propagation I am currently using.

Fellow Starmadian · Jul 13, 2016

I think it's worth mentioning that if you use wireless logic to connect the memory cells to the cortex or whatever, it will vastly improve their response times. You'll have to make a buffer though, as I did for my Vaygr ASCII display board. It stores 128 bits quite nicely, with practically no lag on the display. I've found out that you can multi-thread the logic calculations this way, probably doing terrible things to the server on a large scale, but vastly improving performance. I've also found that just attaching large quantities of logic to a large object creates a large amount of lag, so for that project I docked all the logic to the station instead of just pasting it on. (I like the word large.)

Also, I really need to look into this stuff, seems pretty powerful for lots of different applications.

Trekkerjoe · Jul 13, 2016

I am quite curious, I had started this project with the misconception that wireless logic would introduce a delay, not speed it up. So splitting the memory cells up into 16-bit cartridges will increase the speed? That is good news. The longer the Cortex runs any program, the more the entire mechanism chugs. Looks like for my next build I should invent the starmade equivalent of a motherboard.

As for the buffer, I assume it is there to take into account any delay or desync between the circuits.

Fellow Starmadian · Jul 13, 2016

Not so much increase the speed as reduce lag. And now that you mention delays, I remember that the buffer collected data from the memory while displaying the data that was called for 4 ticks ago, so that is quite a big delay actually... I think that the delay can be gotten rid of, since all I wanted was to remove the desync issues due to the wireless logic. Maybe with some research into removing the delay, the buffer can solve some of your logic woes?
Just realized it might be easier to see the circuit if it was downloadable... I'll have to post it tomorrow, too late atm for the hassle lol.

Trekkerjoe · Jul 13, 2016

(Up far too early. Why you do this to me Windows?)

Perhaps using a timer to split the signalling between the motherboard and the docked logic into a StartOperation stage and a ReadOperation stage would fix the instability. The Cortex uses a ripple-carry adder, which at first was notorious for its junk noise signals. You can't just send data in and get the correct values in the same cycle; however, if you give the circuit a chance to settle before reading from it, you can bypass the instability. I am hoping that this is true for wireless logic as well.
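To illustrate why a ripple-carry adder needs time to settle, here is a small sketch. It assumes an abstract model where each full adder updates once per step using the carry its neighbour produced on the previous step; it does not try to reproduce StarMade's actual logic timing.

```python
def ripple_add_trace(a: int, b: int, width: int = 8):
    """Yield the adder's visible output after each propagation step.

    A carry needs up to `width` steps to ripple from bit 0 to the top
    bit, so early outputs can be wrong; only the last one is guaranteed.
    """
    carries = [0] * (width + 1)  # carries[i] feeds bit i; carries[0] is carry-in
    for _ in range(width):
        new_carries = carries[:]
        total = 0
        for i in range(width):
            x, y, c = (a >> i) & 1, (b >> i) & 1, carries[i]
            total |= (x ^ y ^ c) << i                    # sum bit
            new_carries[i + 1] = (x & y) | (c & (x ^ y))  # carry out
        carries = new_carries
        yield total


if __name__ == "__main__":
    # Worst case: 255 + 1 makes the carry travel through every bit.
    print(list(ripple_add_trace(0b11111111, 1)))
    # Easy case: 3 + 4 generates no carries and is right immediately.
    print(list(ripple_add_trace(3, 4)))
```

Reading the output on the first step of the worst case gives junk; waiting for the final step gives the correct (8-bit wrapped) sum.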

Fellow Starmadian · Jul 13, 2016

I've got to ask, what kind of experience do you have in electronics that allows you to just build a cpu?

Also, I realized today that my marquee board is up for grabs on the NFD build server. You reminded me that I need to post it in the community content section, it was finished about a month ago.

Trekkerjoe · Jul 13, 2016

Ah, I actually went into this with no more than a rudimentary knowledge of solid-state electronics, the tricks I've picked up from working with logic in-game, Wikipedia, and some books I have lying around. I am a novice programmer as well, so I at least have a vague idea of what I am doing, code-wise.

My inexperience with scratch building computers definitely shows with this thing, but it was an educational experience for sure. When I get it fully functional, I will be able to start on my next one with a much, much better idea of what I'm doing.

GalactusX · Jul 14, 2016

In my personal case, I have experience with some programming languages and some robotic systems, and I must say, building your own hardware and saving some simple programs (software) on it is very different from using existing hardware and writing your own software.

Same as you, I try to build different logic systems to show people how far you can go using only basic gates and how it can help in some cases for educational purposes. If you need any help, just ask for exactly what you need to build.

walle_222 · Jul 14, 2016

You might want to be careful when you try to port your CPU to another server. I dived into this stuff a while ago and noticed some interesting errors in ripple-carry addition systems, which was the same thing I used in my program counter. Eventually I had to add a one-second delay to filter out the errors coming from the adder in the PC so the custom flip-flop memory I built for it wouldn't malfunction (it was a single 8-bit register that reset itself on the trailing edge, have fun trying to figure that out). Also, the logic in this game is event-driven, so you can desync signals somewhat and do some weird stuff with flip-flops, but it must be filtered out with a delay block before it is stored and usable, otherwise it will completely screw everything up.
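One way to picture that filtering step is a "only accept a value once it has stopped changing" rule. The sketch below is a software stand-in, not a model of the in-game delay block; `settled_values` and `hold_ticks` are hypothetical names.

```python
def settled_values(samples, hold_ticks=2):
    """Return values that stayed unchanged for `hold_ticks` consecutive samples."""
    accepted = []
    run_value, run_length = None, 0
    for sample in samples:
        if sample == run_value:
            run_length += 1
        else:
            run_value, run_length = sample, 1
        # Latch exactly once, at the moment the value has held long enough.
        if run_length == hold_ticks:
            accepted.append(sample)
    return accepted


if __name__ == "__main__":
    # Glitchy adder output followed by the real result, then a new result.
    noisy = [254, 252, 248, 0, 0, 0, 5, 5]
    print(settled_values(noisy))
```

Transient values that last a single tick never reach the register; only values that hold steady are stored.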

Someday in the near future, when I finally get a day off and time to sit down and play this game again, I'd love to take apart your CPU and learn the pipelining architecture it uses. That is something I've never been able to learn how to do.

Trekkerjoe · Jul 14, 2016

I have finally finished the Logic Cortex! Hopefully. Barring any problems. After hooking up the bit shifter, I moved the upper stack to be above the CU, where it was intended.
Pics:

The power button is obvious, the red one is the reset. To the right of those are the I/O pins.

It is an absolute mess of wiring.

All that's left to do is test it out in some kind of application. I think a robotic arm and super simple CV are in order. After that, if there are no more problems, I will post it on CC.

This computer does not use a pipeline, sorta. Data is handled in stages, but instead of using registers, they overlap allowing data to propagate through the system. This is the architecture I had come up with when I was brainstorming a solution to the instability of components like the adder. The four stages are IterateCounter>Read/Execute>ReadALU/StoreAnswer>RefreshCounter. This means the Adder receives the data and is allowed a quarter-second delay to settle before it is read. Same for almost all the other components. If it was a true pipeline, it would probably have more stages, and they would all run simultaneously. For my next computer I will attempt to make a pipeline. Still, you are free to dissect it when I think it is ready to be released.

This build does make heavy use of instant pulses in the high-speed memory. They are perfect for clearing memory cells and writing to them at the same time.

Also, you mentioned using an adder in your program counter? If that is the normal way, then my setup is very, very weird. It uses a T flip-flop chain and a complicated method for overwriting the data (I am not even sure how I got it to work XD), which results in a massive penalty on the use of the GOTO command. I will replace this with an adder system for my next build.
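For readers unfamiliar with the idea, a T flip-flop chain counts like this. It is a generic ripple-counter sketch with a made-up class name, not a reconstruction of the Cortex's actual wiring.

```python
class TFlipFlopCounter:
    """Ripple counter: each stage toggles when the previous one falls 1 -> 0."""

    def __init__(self, width: int = 8):
        self.bits = [0] * width  # bits[0] is the least significant stage

    def pulse(self) -> None:
        """Apply one clock pulse to the first stage and let it ripple."""
        for i in range(len(self.bits)):
            self.bits[i] ^= 1
            if self.bits[i] == 1:  # rose 0 -> 1: nothing to pass on
                break

    @property
    def value(self) -> int:
        return sum(bit << i for i, bit in enumerate(self.bits))


if __name__ == "__main__":
    counter = TFlipFlopCounter(width=4)
    for _ in range(5):
        counter.pulse()
    print(counter.value)
```

Note that the only input is the pulse: there is no direct way to load an arbitrary address, which is one plausible reason a GOTO is expensive with this kind of counter.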

EDIT:

Wait, you are into robotics too? Awesome. This build is actually one of the stages in my plan to bring robotics into StarMade.

I agree with you, it is far easier to piece together existing components than to wire everything yourself.

Thanks for the offer. I might take you up on it if I hit a dead end.

walle_222 · Jul 15, 2016

Instant memory? Interesting, not sure how you would have made that, but then again it's been a while since I played StarMade. Gotos are actually very simple if you use a register and the adder in a loop with a short delay in between. The major problem with the adder system, though, is that it unfortunately creates erroneous pulses which must be filtered out or they will corrupt data. I ran into that a lot before, so the fact that you actually have a T flip-flop system that works somehow is great, and I would keep using it. What I would do to improve goto performance and make it instant on the next cycle is use your instant memory as the final step before the output. That way you can simply accept the goto signal when needed, else just accept the signal from the T flip-flop chain.
In regards to reading the ALU, if you don't time the instant memory read perfectly, you're going to get the wrong data. In my tinkering with the adder system being used in a PC, the errors always propagated before the correct data was processed through. So it is in fact possible to get the correct data if you can find a way to only get the last (and correct) stream of data sent in the cycle.

Also, my PC architecture wasn't exactly normal. In all my CPUs there was never actually an instruction addressed at 0, so the way I constructed my program counter was that the input always came directly from the adder straight into a highly customized flip-flop register with a few safeguards (if new data was being written to it, it flushed the old data first and stored the new data on the same cycle), with a 1-second delay in between to filter out errors from the adder. Then the data went straight out of the register to the output and to the input of the adder. This architecture not only allowed an instantaneous goto on the next cycle but also allowed the PC to be incremented by more than 1, or to be locked on the same instruction so a loop can be created if the same instruction needs to be done multiple times in a row. I never did implement the increment function in my PC in StarMade, but I did in a different version made in Logisim specifically for that program (because its registers behave like real-world registers). Unfortunately, I think the maximum clock for this setup is 1 pulse per second (one hertz? Been a while...).
Apologies if this is a long post that may not make a whole lot of sense in certain areas.
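The register-plus-adder program counter described in this post behaves roughly like the sketch below. The class and parameter names are illustrative assumptions, and the adder glitches and 1-second filter delay are left out; only the goto, variable step, and hold behaviours come from the description.

```python
class ProgramCounter:
    """Register feeding an adder whose output loops back into the register."""

    def __init__(self, width: int = 8, start: int = 1):
        self.mask = (1 << width) - 1
        # Starts at 1 because the post says address 0 never held an instruction.
        self.register = start & self.mask

    def tick(self, step=1, goto=None):
        """One clock: load `goto` if given, else register + step (step=0 holds)."""
        next_value = goto if goto is not None else self.register + step
        self.register = next_value & self.mask
        return self.register


if __name__ == "__main__":
    pc = ProgramCounter()
    print(pc.tick())          # normal increment
    print(pc.tick(goto=40))   # jump takes effect on this very tick
    print(pc.tick(step=0))    # locked on the same instruction
    print(pc.tick(step=3))    # increment by more than 1
```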

Trekkerjoe · Jul 15, 2016

Sounds like I might want to use a buffered version of a T flip-flop chain instead of an adder. The problem with that system is that the T flip-flops are unstable, and you can't easily write to them. The system I had would pause the timer, erase itself and disconnect the chain, have the new address written, reconnect the chain, send a filtering signal that returns the flip-flops to the imprinted signal, and update the instant memory with the new signal. There are far too many steps, and it takes over three seconds by the time it's over. Quite a steep penalty for something with a throughput of one operation a second.

*Goes into brainstorm mode.*

I might be able to update them faster if I activate all of them with a controlled pulse, send the value I want to get an inverted signal, and toggle them again to invert it back to the correct signal. It may take three clock cycles or more. That is too slow for the pipeline I have planned for the next computer (which should perform operations at 2 Hz), so I will have two or three of them: one that is engaged with the cache, and the other(s) already in the inverted state. That way the delay is masked, allowing me to run it all at 2 Hz.

EDIT: Perhaps with extra logic, I could have NOTs coming out of the T-flip-flops to temporarily invert the signal back, and allow me to update over multiple cycles without corrupting the data in the cache.

You were also wondering about my instant memory. My setup is fairly simple. I have an AND><OR register. There is an instant pulse generator going into a NOT, going into the ANDs. Sending a signal to the instant pulse generator and the overwriting signal to the register will update it instantly, in one clock cycle. Quite useful for RAM and cache memory. The alternative is to give the circuit the logic to change states only if the signal being received is different, but that is a whole other circuit.
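As a rough reading of that description, each cell can be modelled as "stored AND NOT pulse, OR incoming data". The function below is a guess at the behaviour for illustration, with hypothetical names, and it treats the whole register as one integer.

```python
def register_step(stored: int, write_pulse: bool, data: int, width: int = 8) -> int:
    """One update of an AND/OR feedback register.

    The pulse goes through a NOT into the AND gates, so while it is high
    the old contents are cut off and only the incoming data survives.
    """
    mask = (1 << width) - 1
    keep = 0 if write_pulse else mask  # NOT(pulse) fanned out to every AND
    return ((stored & keep) | data) & mask


if __name__ == "__main__":
    old = 0b1010
    print(bin(register_step(old, write_pulse=False, data=0, width=4)))       # holds
    print(bin(register_step(old, write_pulse=True, data=0b0101, width=4)))   # clean overwrite
    print(bin(register_step(old, write_pulse=False, data=0b0101, width=4)))  # bits pile up
```

The last call shows why the clearing pulse matters in this model: without it, new bits are OR-ed on top of the old contents instead of replacing them.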
