[Logic] Processors Design and Conception

    Do you want to see the other older processors ? :)

    • Yes ! put some pictures and specs !

      Votes: 2 28.6%
    • No I don't care !

      Votes: 1 14.3%
    • Yes , pictures AND files with some documentation :D

      Votes: 4 57.1%

    • Total voters
      7

    takethispie

    Titan-class builder
    Joined
    Oct 3, 2012
    Messages
    239
    Reaction score
    103
    • Purchased!
    • Wired for Logic
    • Legacy Citizen 9
    Hi, I'm here to share one of my passions and hobbies: building processors.
    For those who aren't familiar with the term, the processor, or Central Processing Unit (CPU),
    is the "brain" of your computer, sitting on the motherboard.
    Without a CPU, well... it isn't a computer anymore xD

    I started building processors a long time ago, when I found a video by THE1LAZ showing his redstone CPU (I was playing Minecraft at the time) and tried to build one... it took me 3 months just to understand how it works,
    and then I started to build one xD

    Everything you are going to see now is made in Logisim, a FREAKING AWESOME free logic simulator:


    SPECS:
    16bit instructions, 255-instruction program memory
    8bit data
    4 GP registers
    ALU ( + , - , !A , A > B , A = B , A < B )
    255 * 8bit RAM
    Direct unconditional and conditional JUMP only

    It needs 8 instructions to increment a register (add 1 to the register's value), so it was painfully slow.

    The right part holds peripherals like LEDs, buttons and a little LED matrix.
    The design was also poorly done, but it worked! It is 1st generation (my CPUs are classified by generation;
    a new generation appears when a major improvement is made ^^).
    Current generation -> 6th (1-2 CPUs per generation)



    TODAY:
    After 13-15 processors and a LOT of improvements, I started the Fazer processors:

    Fazer-S (Single cycle processor)

    32bit instructions with 16MB program memory
    15 GP registers, 32bit each
    ALU ( + , - , * , / , !A , logical shift left, logical shift right, arithmetic shift right, random, XOR )
    Register conditional and unconditional JUMP
    Conditional and unconditional BRANCH
    Direct-mapped cache with 2^18 entries

    32bit General Purpose IO with 32bit addressing (yes, I can address more than 4 billion peripherals xD)
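
    For readers new to caches, here is a minimal Python sketch of how a direct-mapped cache with 2^18 entries can pick its slot. It rests on assumptions that are not in the spec above: 32bit word addresses, one 32bit word per entry, and a plain dict standing in for main memory. The class and field names are made up for illustration; this is not the Fazer circuit itself.

```python
# Hypothetical model of a direct-mapped cache with 2**18 entries.
# Assumed layout (not from the post): index = low 18 address bits,
# tag = remaining high bits, one 32-bit word stored per entry.
INDEX_BITS = 18
NUM_ENTRIES = 1 << INDEX_BITS


class DirectMappedCache:
    def __init__(self, backing):
        self.backing = backing              # dict standing in for RAM
        self.valid = [False] * NUM_ENTRIES
        self.tags = [0] * NUM_ENTRIES
        self.data = [0] * NUM_ENTRIES
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        index = addr & (NUM_ENTRIES - 1)    # which slot the address maps to
        tag = addr >> INDEX_BITS            # identifies who owns the slot
        if self.valid[index] and self.tags[index] == tag:
            self.hits += 1
            return self.data[index]
        # Miss: fetch from backing memory and overwrite the slot.
        self.misses += 1
        value = self.backing.get(addr, 0)
        self.valid[index] = True
        self.tags[index] = tag
        self.data[index] = value
        return value


# Two addresses that share index 0x10 but differ in tag.
ram = {0x0000_0010: 111, 0x0004_0010: 222}
cache = DirectMappedCache(ram)
```

    Reading 0x00000010 twice gives one miss and then one hit; reading 0x00040010 afterwards evicts the first word because both addresses land in the same slot, which is the main weakness of a direct-mapped design.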


    Fazer-P ( pipelined processor)

    SPECS:
    4-stage pipeline (the last stage is not pipelined because I'm new to cache memory)
    32bit instructions with 16MB program memory
    15 GP registers, 32bit each
    ALU ( + , - , * , / , !A , logical shift left, logical shift right, arithmetic shift right, random, XOR )
    Register conditional and unconditional JUMP
    Conditional and unconditional BRANCH
    Direct-mapped cache with 2^18 entries

    32bit General Purpose IO with 32bit addressing

    It needs 4 cycles to increment a register, BUT that's not a good point of comparison, as the Foxtrot16 was not pipelined and this one is. In fact the Fazer-P is something like 10 times faster and more powerful than the Foxtrot ;)

    It's my first pipelined processor, and it's by far more difficult to build than a single cycle CPU.
    I need to figure out how to add interrupts; it must be hardcore to do >_<

    It has the exact same features as the Fazer-S. In a simulated environment you can't see the difference between a single cycle and a pipelined processor, but in the real world a pipelined processor can be VERY fast, whereas a single cycle one is limited by the fact that signals must propagate through the whole processor within one clock cycle, so the clock must be slower ^^
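
    The clock argument can be put into numbers with a small Python sketch. The stage delays and the latch overhead below are invented purely for illustration, and the model is idealized (no stalls, hazards or cache misses): a single cycle design needs a clock period as long as all stages together, while a pipelined design only needs a period as long as its slowest stage plus the pipeline registers.

```python
# Made-up propagation delays for a 4-stage datapath, in nanoseconds.
stage_delays_ns = [2.0, 3.0, 4.0, 3.0]
# Made-up extra delay added by the pipeline registers between stages.
latch_overhead_ns = 0.5

# Single cycle: the signal crosses every stage within one clock period.
single_cycle_period_ns = sum(stage_delays_ns)
# Pipelined: the clock only has to cover the slowest stage.
pipelined_period_ns = max(stage_delays_ns) + latch_overhead_ns


def total_time_ns(n_instructions, period_ns, pipeline_depth=1):
    """Ideal run time: fill the pipeline once, then finish one instruction per cycle."""
    cycles = n_instructions + pipeline_depth - 1
    return cycles * period_ns


one_single = total_time_ns(1, single_cycle_period_ns)
one_piped = total_time_ns(1, pipelined_period_ns, pipeline_depth=4)
many_single = total_time_ns(1000, single_cycle_period_ns)
many_piped = total_time_ns(1000, pipelined_period_ns, pipeline_depth=4)
```

    With these numbers a single instruction is actually slower on the pipeline (4 cycles of 4.5 ns against 1 cycle of 12 ns), but 1000 instructions finish in 4513.5 ns instead of 12000 ns. That is why counting cycles per increment is a poor comparison between the two designs.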



    "but what's the use of a simulated processor ? you can't do anything interesting !"

    FALSE !
    Logisim has a lot of input/output peripherals like buttons, LEDs, an LED matrix and even a joystick and a keyboard!
    Also, Cornell University created a library of useful components, like a 128*128 graphical LCD
    that I will be using with the Fazer-S to do fun things like loading an image and displaying all sorts of things.


    You can see the green dot on the LCD; I was testing the LCD driver chip. You can't see the instructions in the RAM because it's an exported image and not a screenshot, but I only need
    4 instructions to draw a dot (5 instructions when I access the LCD for the first time)



    The next step is to add interrupts (easy to do on a single cycle CPU) and an MMU (Memory Management Unit) to add virtual memory and paging, which is essential to make... an operating system ;)




    Another funny thing: I made an IDE (Integrated Development Environment) to write programs,
    with an integrated dynamic assembler to write code easily. It's still in development:


    features:
    basic syntax highlighting
    debugger (detects argument length errors, incorrect characters, syntax errors, etc.)
    project management -> includes, ISA and the main program

    But the most important one: the dynamic assembler.
    What's that? Well, it's an assembler where you can change the instructions as you wish. It works with an ISA (Instruction Set Architecture) file which contains all the instructions and their hexadecimal equivalents.
    You can load an ISA on the fly and compile the program for another processor (but you need to keep the same ASM names!)

    The ISA file contains lines like this:

    ADD 01cFab00

    a, b and c are arguments, each one hexadecimal digit wide
    (F isn't recognized as an argument because it's in uppercase)
    so in the IDE this instruction will work like this:

    add 1 2 3
    will be converted to -> 013f1200
    The keywords (add) aren't case sensitive, so add, Add and ADD have the same meaning to the compiler.

    The major drawback of this compiler is that you can't use arguments whose length is not a multiple of 4 bits (one hexadecimal digit); that's why I have only 15 registers in my processors (4 bits -> values 0 to 15).
    Anyway, every processor I've made has its instructions split into 4-, 8-, 16- or 24-bit parts ^^
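
    To show the template idea in code, here is a rough Python sketch of such a dynamic assembler. It is not the code of the IDE; the function names are hypothetical, and it assumes that placeholders are lowercase letters taken in alphabetical order (a = first operand, b = second, ...) and that repeating a letter widens the field by one hex digit per repetition. Only the one-digit case is confirmed by the ADD example above.

```python
def parse_isa(text):
    """Read 'MNEMONIC TEMPLATE' lines, e.g. 'ADD 01cFab00' -> {'ADD': '01cFab00'}."""
    isa = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            isa[parts[0].upper()] = parts[1]
    return isa


def assemble_line(isa, line):
    """Fill the template of one instruction with its operands."""
    mnemonic, *operands = line.split()
    template = isa[mnemonic.upper()]        # keywords are not case sensitive
    # Lowercase letters are placeholders; uppercase hex digits stay literal.
    letters = sorted({ch for ch in template if ch.islower()})
    if len(letters) != len(operands):
        raise ValueError("expected %d operands" % len(letters))
    out = template
    for letter, operand in zip(letters, operands):
        width = template.count(letter)      # field width in hex digits (assumed)
        value = int(operand, 0)             # accepts 3 as well as 0x3
        if not 0 <= value < 16 ** width:
            raise ValueError("operand %r does not fit in %d bits" % (operand, 4 * width))
        # Uppercase digits cannot collide with the remaining placeholders.
        for digit in format(value, "0%dX" % width):
            out = out.replace(letter, digit, 1)
    return out.lower()


isa = parse_isa("ADD 01cFab00")
word = assemble_line(isa, "add 1 2 3")      # '013f1200', as in the example above
```

    Swapping in another ISA file only changes the dictionary returned by parse_isa, so the same source can be assembled for a different processor as long as the mnemonics match.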


    Hope you like it, and if you have questions don't hesitate; I would love to answer because not many
    people love that kind of stuff xD
     

    Jaaskinal

    ¯\_(ツ)_/¯
    Joined
    Jan 19, 2014
    Messages
    1,377
    Reaction score
    646
    • Legacy Citizen 4
    • Wired for Logic Gold
    • Thinking Positive
    I'm just starting to get into the very basics of this type of stuff :P
    Right now this is my most complex thing http://starmadedock.net/content/logic-counter.2143/
    It kinda blows minds of people who don't know a ton about logic, but people who love logic know that it's really not that huge :P
    Also, if you aren't already, please consider uploading content to community content. Logic always seems to fall behind every other category, and we need more logic xD
     

    takethispie

    Jaaskinal yeah, I can see that; still, it's good to start from the basics ^^
    Usually people don't even understand what I'm doing xD
    Very cool! But you seem to use a tremendous amount of gates. Are your 7-segment displays showing decimal or hexadecimal?
    What kind of design are you using? Adder + register? :)

    well I've done a working CPU in logisim:






    You can only enter one instruction at a time because I still have bugs with the program counter and I need to build a ROM, but it's not my priority; I'm more into Logisim and FPGAs ><
     

    Jaaskinal

    adder? register? no clue; brute force xD
    Decimal :P
     

    takethispie

    ok xD
    You need to learn combinational logic (multiplexer, adder, etc.) and sequential logic (RS NOR latch, D flip-flop, T flip-flop).
    And really, you shouldn't use a decimal display; use a hexadecimal display instead, or just a binary display.
    With adder + register it should take fewer than 200 gates to build an 8bit counter (0 to 255) ^^'
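
    As a behavioural illustration of the "adder + register" design, here is a small Python sketch. It models the gates as bit operations and is only an assumption about how such a counter could be wired (a ripple-carry adder that adds the constant 1 to an 8bit register on every tick); the names are made up and it says nothing about the exact gate count.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder built from XOR, AND and OR."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def ripple_add(x_bits, y_bits):
    """Add two little-endian bit lists of equal length; the final carry is dropped."""
    carry = 0
    out = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out


class Counter8:
    ONE = [1, 0, 0, 0, 0, 0, 0, 0]      # the constant 1, least significant bit first

    def __init__(self):
        self.register = [0] * 8         # stands in for 8 D flip-flops

    def tick(self):
        # On each clock edge the register stores (register + 1).
        self.register = ripple_add(self.register, self.ONE)

    @property
    def value(self):
        return sum(bit << i for i, bit in enumerate(self.register))


counter = Counter8()
for _ in range(5):
    counter.tick()
# counter.value is now 5; after 256 ticks in total it wraps back to 0.
```

    The register value is already binary, so feeding it to two hexadecimal displays needs no extra decoding logic, which is where most of the gates in a decimal design tend to go.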
     

    takethispie

    update: I've built a RAM chip :D
    It only has about 270 MB because I'm just too lazy to build all 255 modules xD


    Each module is chained to the next, which lets me avoid using a multiplexer to select a module. The lower 24 bits of an address are the address within the module and the upper 8 bits select the module, so I can have up to 4GB like any 32bit computer :D
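
    The address split can be sketched in a few lines of Python. The function name is hypothetical, and the size arithmetic assumes one byte per address, which the post does not state.

```python
OFFSET_BITS = 24    # lower 24 bits: address inside a module
MODULE_BITS = 8     # upper 8 bits: which module is selected


def decode(addr):
    """Split a 32-bit address into (module number, offset inside the module)."""
    addr &= 0xFFFF_FFFF
    module = addr >> OFFSET_BITS
    offset = addr & ((1 << OFFSET_BITS) - 1)
    return module, offset


# Assuming byte-sized locations, one module covers 2**24 addresses.
bytes_per_module = 1 << OFFSET_BITS
max_modules = 1 << MODULE_BITS
total_bytes = bytes_per_module * max_modules    # equals 2**32, i.e. 4 GiB
```

    Under that byte-addressing assumption one module is 16 MiB, so 16 modules would give 268,435,456 bytes, which matches the "about 270 MB" figure, and a fully populated set of modules covers the whole 32bit address space.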

    Let's see what a memory module looks like inside:


    I also built the subroutine circuit to support hardware functions, so I can now use both compiler functions (simple duplication of the code) and
    hardware functions :)

    next step is the interrupt controller \o/
     

    takethispie

    update: here comes the interrupt controller :D

    Here is what the Fazer-S looks like now (screenshots this time, not Logisim's export image option):


    the interrupt controller circuit:


    I'm going to create some programs that I will provide with the files and documentation :)
     

    takethispie

    update: I added a new feature to the register bank:
    it can save all registers or load the saved registers (loading the saved registers will overwrite the current ones)

    inside the register bank


    the register bank inside the CPU:

    Saving the registers is pretty useful to avoid data corruption during interrupts, but it can be used for other purposes too.
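
    A behavioural Python sketch of the save/load idea follows. It assumes a second "shadow" copy of the 15 registers that is written in one step; the class and method names are invented, and the real circuit may be organised differently.

```python
class RegisterBank:
    """Hypothetical model: 15 working registers plus one shadow copy."""

    def __init__(self, count=15):
        self.regs = [0] * count
        self.shadow = [0] * count

    def write(self, index, value):
        self.regs[index] = value & 0xFFFF_FFFF   # keep values 32 bits wide

    def read(self, index):
        return self.regs[index]

    def save_all(self):
        # Copy every working register into the shadow copy.
        self.shadow = list(self.regs)

    def load_all(self):
        # Overwrites the current registers with the saved ones.
        self.regs = list(self.shadow)


bank = RegisterBank()
bank.write(3, 42)       # main program state
bank.save_all()         # entering an interrupt handler
bank.write(3, 7)        # the handler clobbers register 3
bank.load_all()         # leaving the handler restores the old value
```

    After load_all, register 3 holds 42 again, so the interrupted program continues as if nothing had touched its registers.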

    I've also rebuilt the cache memory because it was not working correctly; the new design is similar though:


    I've also updated the IDE and compiler a bit: I added a project manager to work with different ISAs and files, including libraries, etc.
    Here is some code, in this case to set a pixel to white:


    and the program executed on the CPU (you can see the little white pixel ^^ ):



    Also, the documentation is in the making, and I will provide the Hynix, an older processor from the previous generation, for those who are interested :)
     
    Joined
    Mar 11, 2015
    Messages
    141
    Reaction score
    39
    • Community Content - Bronze 1
    • Purchased!
    takethispie
    Regarding your post in the Logic-Tutorial:
    Yeah, I built a processor, and I'm trying to find solutions to avoid the loading problem.
    http://starmadedock.net/threads/chunk-loader-sector-loader.6415/

    I have already found a temporary fix to avoid "memory" corruption.
    Because the logic is loaded in the order it was placed, I can check whether every module is loaded by setting up the control logic first. It's a simple clocked signal going through an extra activation block per module, and when the clock signal arrives correctly at the end, I know everything is loaded.
    For this to work you also have to build your instruction bus first, so that it is loaded first and you can feed the system with NOPs until your loading watchdog says everything is OK and running.

    EDIT:
    This is not working all the time, but it reduced errors to less than 10% :)
     

    takethispie

    Loading isn't the only problem: all finite state machines and sequential logic sometimes behave unpredictably and/or don't act as they should because of how logic is handled in StarMade.
    When I have a loading bug I can't activate any logic gate or activator for a few seconds; logic is basically frozen xD
    But thank you, it's a good idea.
    If you like building processors you should try Logisim :p
     

    takethispie

    update:
    I finished the most important part of the documentation and created a devkit:
    - the Fazer-S 32bit MIPS RISC processor, so yes, it's the latest one I've made!
    (some secondary parts are missing, like the computer port allowing interrupts)
    with its documentation in PDF

    - the TKAS assembler, a console program unlike CPUcompil. It works with only a single ISA description file at a time, isa.txt.
    It's easy to use: asm <yourSourcePath> <yourDestinationPath>
    Ex: asm test prgm <- only works if TKAS.exe and the files are in the same directory, which I strongly recommend because it's faster
    To clear the console: clear, and to quit: quit, pretty obvious xD

    - Logisim 2.7.1

    Fazer_computer is the file to open -> right click on the memory in the upper left -> Load Image, and then select your file.
    Choose your frequency in Simulate -> Tick Frequency (don't forget that if you're using RAM access you can't go faster than 2kHz).
    To start the clock: CTRL + K; to step tick by tick: CTRL + T

    That's everything you need to create programs and test them!
    Of course the Fazer-S is missing a few things, and so is the documentation, but I will update both regularly :)

    https://mega.co.nz/#!bF4RnZRZ!k6mMyXtB8in3R7p7G8ptPJSgLf53Z7es7i8FF_fc4WA
     

    NeonSturm

    StormMaker
    Joined
    Dec 31, 2013
    Messages
    5,110
    Reaction score
    617
    • Wired for Logic
    • Thinking Positive
    • Legacy Citizen 5
    it can save all register or load saved register (loading saved register will overwrite the current registers)

    1. You should not save ALL registers.
    • Instead of the calling function saving all registers, the called function should save only the registers it writes to itself. That saves stack memory.


    2. I would mirror the registers (( for a background-EEC + roll-back (if needed) )) with 2 equal circuits
    • Mirroring could also provide the possibility to execute jump commands and stack-saves/restores in parallel to boost function calls/returns.
    • This could diminish the time-disadvantage of Point 1.
    3. + add the option to split it into 2 CPU-Cores without EEC.


    4. Perhaps you can implement hardware-side support for multiple return values in functions.


    5. Your RAM could also have a feature to follow links and return this value instead
    • light can only travel <10cm per cycle at 3 GHz
    • Don't ping-pong values between RAM+CPU if not necessary.

    6. Cache-Memory has a time-stamp next to each value or a similar mechanism to check which values are the oldest and should be dropped.
    • You can (ab)use the comparator logic to implement Single-Instruction-Multiple-Data commands with little additional logic.
    • Perhaps a lookup-table which can also be used to increase first value by second (position + velocity for many objects, etc)

    7. Variable Cache Modules could enhance the size of Stack/Heap/Cache dynamically to fit different programs with fewer total circuitry.
     

    takethispie

    1. You should not save ALL registers.
    • Instead of the calling function saving all registers, the called function should save only the registers it writes to itself. That saves stack memory.
    Well, I kinda removed the save/load register functionality when I rebuilt the register file. Also, it doesn't use a stack:
    it would take a few dozen nanoseconds to save (actually just the time it takes to go from one register to another)

    5. Your RAM could also have a feature to follow links and return this value instead
    • light can only travel <10cm per cycle at 3 GHz
    • Don't ping-pong values between RAM+CPU if not necessary.
    I know, but those rules don't apply in Logisim, where the maximum frequency is 4.1kHz xD

    4. Perhaps you can implement hardware-side support for multiple return values in functions.
    There are 15 general purpose registers; you can use most of them for multiple return values :)

    6. Cache-Memory has a time-stamp next to each value or a similar mechanism to check which values are the oldest and should be dropped.
    • You can (ab)use the comparator logic to implement Single-Instruction-Multiple-Data commands with little additional logic.
    • Perhaps a lookup-table which can also be used to increase first value by second (position + velocity for many objects, etc)

    7. Variable Cache Modules could enhance the size of Stack/Heap/Cache dynamically to fit different programs with fewer total circuitry.
    No, implementing parallelism isn't just a bit of additional logic you drop in. The Fazer CPUs won't have this, maybe a future version will,
    but it's not a priority at all!

    It's not an x86 CPU; stack and heap don't exist ^^

    Cache size can't be changed, even in real life by the way
     

    NeonSturm

    There are 15 general purpose registers; you can use most of them for multiple return values :)
    You still rely on a hell-load of register-copy commands? :D It was just some random idea ;)

    It's not an x86 CPU; stack and heap don't exist ^^

    Cache size can't be changed, even in real life by the way
    IRL some caches reserve cache cells for 16 or 32 contiguous bits even if just 8 bits are used
    • Because you save comparator logic elements for 1/2 to 3/4 of the cache bits.

    But I have never heard of the improvement of deciding which cache chip is used depending on the number of contiguous bits/bytes of requested data.

    Would you not want to be the first one to implement these things?
     

    takethispie

    You still rely on a hell-load of register-copy commands? :D It was just some random idea ;)
    Yes, but it's not a copy: results are stored directly in the registers, which takes only one clock cycle, you know ^^


    IRL some caches reserve cache cells for 16 or 32 contiguous bits even if just 8 bits are used
    • Because you save comparator logic elements for 1/2 to 3/4 of the cache bits.
    Each cache line is divided into 4 blocks of 32 bits (in the x86 architecture, for example), whether it's a direct-mapped, set-associative or fully associative cache

    But I have never heard of the improvement of deciding which cache chip is used depending on the number of contiguous bits/bytes of requested data.
    Would you not want to be the first one to implement these things?
    I don't understand what you are saying: you always "request" the same amount of data, which is 32 bits in most CPUs

    p.s: I'm talking about the DataCache not L2 or L3 cache