If a simple pipelined processor is super-pipelined by a factor of 3 (so that the ALU takes 3 cycles instead of 1, even for the simplest operation), back-to-back dependent instructions require ______ stall cycles even with perfect bypassing.
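
One way to reason about this is to compare the cycle in which the dependent instruction would normally enter the ALU with the cycle in which the forwarded result first becomes available. The sketch below is a minimal model under stated assumptions, not a description of any particular processor: it assumes a single-issue, in-order pipeline, and that bypassing forwards the result from the end of the last ALU stage directly to the start of the first ALU stage. The function name `bypassed_stall_cycles` and its parameter `alu_latency` are hypothetical names introduced for illustration.

```python
def bypassed_stall_cycles(alu_latency: int) -> int:
    """Stall cycles between a producer and an immediately following
    dependent consumer.

    Assumptions (not from the question itself): single-issue, in-order
    pipeline; the result is forwarded from the end of the last ALU stage
    to the start of the first ALU stage.
    """
    if alu_latency < 1:
        raise ValueError("alu_latency must be at least 1 cycle")

    producer_enters_alu = 0
    # Without stalls, the consumer enters the ALU one cycle after the producer.
    consumer_enters_unstalled = producer_enters_alu + 1
    # The forwarded result is usable in the cycle after the producer's last ALU stage.
    result_available = producer_enters_alu + alu_latency
    # The difference is the number of bubbles that must be inserted.
    return result_available - consumer_enters_unstalled
```

Calling `bypassed_stall_cycles(3)` models the case in the question, while `bypassed_stall_cycles(1)` models the original single-cycle ALU, where bypassing alone removes every stall.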