
The V850E1 and V850ES CPUs achieve higher performance and higher code efficiency than the V850 CPU through the
following improvements.
• Shift to 3-operand manipulations in 1 slot
[Figure: pipeline configuration, showing the Master Pipeline (V850 CPU compatible), the br/sld Pipeline, and the Async WB Pipeline with an address calculation stage and load and store buffers (1 stage each). Stages shown: IF, ID, EX, MEM, DF, WB.]
IF (Instruction fetch): Fetches instructions and increments the fetch pointer.
ID (Instruction decode): Decodes instructions, creates immediate data,
and reads registers.
EX (ALU, multiplier, barrel shifter execution): Executes decoded instructions.
MEM (Memory access): Accesses memory at the corresponding address.
WB (Writeback): Writes execution results to registers.
DF (Data fetch): Transfers execution data to WB stage.
• Pipeline configuration
[Figure: non-blocking load/store, comparing a load instruction, an ADD instruction, and the next instruction when the load accesses external memory (T1, T2, T3).]
Conventional (V850 CPU): Pipeline is stopped until the MEM stage completes.
V850E1 CPU: Effective pipeline processing that uses the Async WB Pipeline when appropriate, according to the instruction.
• Non-blocking load/store
[Figure: pipeline operation with a branch instruction and the branch destination instruction.]
Conventional (V850 CPU): Branch destination determined in EX stage.
V850E1 CPU: Branch destination determined in ID stage (1-clock-cycle reduction).
• Parallel instruction execution (when executed by internal ROM)
• Addition of branch/load pipes
• Pipeline operation with branch instruction
[Figure: parallel instruction execution of an ADD instruction (16-bit length), a branch instruction (16-bit length), and the next instruction, comparing the conventional V850 CPU with the V850E1 CPU. Result: 2-clock-cycle reduction.]
Conventional (V850 CPU):
mov r20(src1), r21(dst)
add r22(src2), r21(dst)
V850E1 CPU:
add r22(src2), r20(src1), r21(dst)
• A sequence from mov to an arithmetic instruction is detected in the ID stage, and if dst is the same, the following manipulation is performed.
src1: Replaced with the source of mov (r20 above)
src2: src2 of the arithmetic instruction
dst: As is
• mov + add instructions executable in 1 clock cycle
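The detection rule above can be sketched in C as a simplified model. This is only an illustration under stated assumptions: the record types, field names, and the function `fuse_mov_add` are hypothetical and do not reflect the real V850 instruction encoding or the actual ID-stage logic. The sketch also conservatively declines to fuse when the add reads the mov destination as its source, a case the text does not describe.

```c
#include <stdbool.h>

/* Hypothetical, simplified 2-operand instruction record (not a real encoding). */
typedef enum { OP_MOV, OP_ADD } opcode_t;

typedef struct {
    opcode_t op;
    int src;  /* source register number */
    int dst;  /* destination register number */
} insn2_t;

/* Hypothetical 3-operand form: dst = src1 + src2. */
typedef struct {
    int src1;
    int src2;
    int dst;
} insn3_t;

/*
 * Returns true and fills *out when "mov a, d" followed by "add b, d"
 * can be handled as the single operation d = a + b.
 */
bool fuse_mov_add(insn2_t first, insn2_t second, insn3_t *out)
{
    if (first.op != OP_MOV || second.op != OP_ADD)
        return false;
    if (first.dst != second.dst)
        return false;          /* dst must be the same */
    if (second.src == first.dst)
        return false;          /* assumption of this sketch: do not fuse */

    out->src1 = first.src;     /* replaced with the source of mov */
    out->src2 = second.src;    /* source of the arithmetic instruction */
    out->dst  = second.dst;    /* destination as is */
    return true;
}
```

With the example from the text, `mov r20, r21` followed by `add r22, r21` yields src1 = 20, src2 = 22, dst = 21, which corresponds to `add r22, r20, r21`.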
Non-blocking load/store
• Improved bus use efficiency
• Shorter interrupt insensitivity period
Addition of branch/load pipes
• 2-clock branching
• Parallel execution of instructions
Shift to 3-operand manipulations in 1 slot
• Improved absolute performance
• Example: Synchronous processing of mov + add
Addition of high-level language-compatible instructions
• Improved code efficiency
• 10 to 15% improvement in object efficiency when C compiler used
* The next branch instruction code is also fetched due to the internal 32-bit bus.
• Addition of high-level language compatible instructions
The V850E1 and V850ES CPUs have enhanced the
instruction set of the V850 CPU as follows.
V850E1, V850ES Architecture
switch (2 bytes)
• C language switch statement processing converted into an instruction
callt (2 bytes)/ctret (4 bytes)
• Table-reference branching
• Reduces the size of frequently appearing call code
Data conversion instructions (2 bytes)
• char and short type casts executed using 1 instruction
• sxh, sxb, zxb, and zxh instructions
prepare/dispose (4 bytes)
• Function start/end processing executed using 1 instruction
unsigned Load
• Reduction of unsigned manipulation code
mov imm32, reg (6 bytes/2 clock cycles)
• Reduction of address setting code
mul/mulu (4 bytes)
• Reduction of array address calculation
• Improvement of sum-of-products performance
Other
• Bit manipulation (register indirect bit specification)
• cmov (conditional move), divide (div/divu/divhu)
• sasf, endian conversion
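As a rough illustration of the char and short type casts that the data conversion entry refers to, the following C functions show the four operations in source form. The function names are hypothetical, and whether a particular compiler emits sxb, sxh, zxb, or zxh for each cast is an assumption that depends on the compiler and its options.

```c
#include <stdint.h>

/*
 * Narrowing an out-of-range value to a signed type is implementation-defined
 * before C23; mainstream compilers such as GCC and Clang wrap modulo 2^N,
 * which is the behavior this sketch assumes.
 */

/* Sign-extend the low 8 bits to 32 bits (the kind of cast sxb targets). */
int32_t sign_extend_byte(uint32_t x)
{
    return (int32_t)(int8_t)x;
}

/* Sign-extend the low 16 bits to 32 bits (the kind of cast sxh targets). */
int32_t sign_extend_half(uint32_t x)
{
    return (int32_t)(int16_t)x;
}

/* Zero-extend the low 8 bits to 32 bits (the kind of cast zxb targets). */
uint32_t zero_extend_byte(uint32_t x)
{
    return (uint32_t)(uint8_t)x;
}

/* Zero-extend the low 16 bits to 32 bits (the kind of cast zxh targets). */
uint32_t zero_extend_half(uint32_t x)
{
    return (uint32_t)(uint16_t)x;
}
```

Without a dedicated instruction, each of these casts typically needs a shift pair or a mask, which is where the code size saving described above would come from.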
[Figure: V850E2, V850E2M CPU pipeline operation. Instructions 1 to 12 pass through the IF, DP, ID, EX, AT, DF, and WB stages over internal system clock cycles <1> to <12>. Labels: time flow, internal system clock, processing simultaneously performed by CPU, instructions executed in each clock cycle, instruction completion.]
V850E2, V850E2M CPU features
V850E2, V850E2M CPU main added functions
V850E2, V850E2M CPU pipeline configuration
Execution of up to 2 instructions/clock possible (dependent on instruction sequence)
V850E2, V850E2M CPU pipeline operation
[Figure: V850E2, V850E2M CPU pipeline configuration. Blocks shown: instruction memory and instruction cache; instruction fetch pipeline (F-pipe); instruction fetch unit (B-pipe); instruction buffer; dispatch unit; instruction decode unit L and instruction decode unit R; instruction execution pipeline left (L-pipe) and instruction execution pipeline right (R-pipe), with BSFT, ALU, ALU, MEM, and MUL units; register file; write back unit; data memory and data cache.]
2 instructions can be executed by simultaneously using 2 instruction execution units
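To put the "up to 2 instructions/clock" statement into numbers, here is a minimal sketch of an idealized cycle count. It assumes no stalls, hazards, or cache misses and that every instruction pair can be issued together, which the source notes depends on the instruction sequence. The function name and parameters are hypothetical.

```c
/*
 * Idealized cycle count for an in-order pipeline.
 * Assumptions of this sketch: no stalls, and every issue group is full
 * except possibly the last one.
 */
unsigned ideal_cycles(unsigned n_insns, unsigned stages, unsigned issue_width)
{
    if (n_insns == 0 || stages == 0 || issue_width == 0)
        return 0;

    /* Number of issue groups, rounded up. */
    unsigned groups = (n_insns + issue_width - 1) / issue_width;

    /* The first group needs 'stages' cycles; each later group adds one. */
    return stages + groups - 1;
}
```

For 12 instructions on a 7-stage pipeline, this model gives 12 cycles with 2-way issue and 18 cycles with single issue. The 12-cycle result appears consistent with the 12 clock cycles labeled in the pipeline operation figure, although real instruction sequences will not always pair.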
V850E2, V850E2M Architecture
V850E2M high-performance CPU core: 512 MIPS @ 200 MHz
Improved internal architecture for performance 1.6 times that
of the E1 and 1.2 times that of the E2
Backward instruction compatibility with V850E1, V850ES
and V850E2 CPUs at object level
7-stage pipeline
- Execution cycle optimization (V850E2M)
Eliminates flag hazards and speeds up conditional branching.
Improved interrupt functions
Processor protection functions (V850E2M)
- System register protection
- Memory protection
- Peripheral device protection
- Timing monitoring
The above four functions detect or inhibit illegal use of system
resources and improper monopolization of CPU execution time.
Support of expanding application software sizes
- Address space (program/data) expansion
- Strengthened cache memory support
High-speed division instructions (V850E2M)
- Variable-step division instructions added for high-speed
calculation.
Single-precision and double-precision floating-point
instructions (V850E2M)
- Compliant with IEEE 754-1985
32-bit relative branch instruction
- Support of program space expansion
- Long-distance branching performance, elimination of code
efficiency losses
3-operand instructions (addition of target operations)
- Higher speed processing of operations such as multiplex add/subtract
(64-bit operation, saturate operation) and bit shift.
Sum-of-products instruction
- Higher speed 32-bit sum-of-products operation
(32 × 32 + 64 → 64 bits)
Bit search instruction
- Bit row change point search for run length measurement,
contributing to increased speed of conversion from integers to
floating-point values, etc.
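The sum-of-products, saturate, and bit search items above can be illustrated with plain C equivalents of the arithmetic involved. This is a sketch under stated assumptions: the function names are hypothetical, the code models only the mathematical results and not the instructions' operand formats, flags, or return conventions, and the bit search is shown as a leading-zero count, which is one plausible use in integer to floating-point normalization.

```c
#include <stdint.h>

/*
 * 32 x 32 + 64 -> 64-bit signed sum-of-products step.
 * The caller must keep the accumulator in range; signed overflow is
 * undefined behavior in C.
 */
int64_t mac32(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* Saturating 32-bit add: clamps to INT32_MAX / INT32_MIN instead of wrapping. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX)
        return INT32_MAX;
    if (sum < INT32_MIN)
        return INT32_MIN;
    return (int32_t)sum;
}

/*
 * Software bit search from the most significant bit:
 * returns the number of leading zero bits (0..32), 32 when x == 0.
 */
int leading_zeros32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}
```

The loop in `leading_zeros32` can take up to 31 iterations in software, which shows why a single-instruction search helps when normalizing a mantissa.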
[Figure: performance comparison in MIPS]
V850E1 (150 MHz): 323 MIPS *1
V850E2 (200 MHz): 432 MIPS *1
V850E2M (200 MHz): 512 MIPS *2
*1 Dhrystone1.1
*2 Dhrystone2.1
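The "1.6 times" and "1.2 times" figures quoted for the V850E2M can be checked against the MIPS values above with a short calculation. The struct and function names below are hypothetical. Note that, per the footnotes, the values were measured with different Dhrystone versions and the V850E1 at a different clock, so the ratios are indicative only.

```c
/* MIPS and clock figures as quoted in this document. */
typedef struct {
    const char *name;
    double mips;
    double mhz;
} core_perf_t;

const core_perf_t V850E1_PERF  = { "V850E1",  323.0, 150.0 };
const core_perf_t V850E2_PERF  = { "V850E2",  432.0, 200.0 };
const core_perf_t V850E2M_PERF = { "V850E2M", 512.0, 200.0 };

/* Ratio of absolute MIPS figures between two cores. */
double speedup(core_perf_t newer, core_perf_t older)
{
    return newer.mips / older.mips;
}

/* MIPS delivered per MHz of clock. */
double mips_per_mhz(core_perf_t core)
{
    return core.mips / core.mhz;
}
```

The absolute ratios come out at roughly 1.59 (V850E2M over V850E1) and 1.19 (V850E2M over V850E2), matching the rounded figures in the text, and the V850E2M works out to 2.56 MIPS/MHz.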
            V850E1      V850E2      V850E2M
Priority    8 levels    8 levels    16 levels
Channels    117         117         256
IF: Instruction fetch
DP: Dispatch
ID: Instruction decode
EX: Instruction execution
AT: Address transfer
DF: Data fetch
WB: Writing execution result to register