Issue in Enabling L2 Cache of PXA320

I am porting a Windows CE 6.0 BSP based on the PXA320 Zylonite BSP, and I am
facing a critical problem with enabling the L2 cache. In Windows CE, cache
flushing is done through the function void OEMCacheRangeFlush(LPVOID pAddr,
DWORD dwLength, DWORD dwFlags). The operation is selected based on the data
size (the dwLength parameter): if dwLength is greater than or equal to the
cache size, the entire cache is flushed; otherwise only the cache lines
covering the range are flushed.
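To make the size-based selection concrete, here is a simplified sketch. It is not the actual Zylonite BSP code: the names select_flush_mode and lines_in_range are made up for illustration, and the 32-byte line size is an assumption (the 256K cache size is the figure from this post).

```c
#include <stdint.h>

#define L2_CACHE_SIZE  (256u * 1024u)  /* 256K L2 cache, as described in this post */
#define L2_LINE_SIZE   32u             /* assumed L2 line size, for illustration only */

typedef enum { FLUSH_ENTIRE_CACHE, FLUSH_BY_LINE } flush_mode_t;

/* Hypothetical helper: choose the flush strategy from the requested length. */
static flush_mode_t select_flush_mode(uint32_t dwLength)
{
    return (dwLength >= L2_CACHE_SIZE) ? FLUSH_ENTIRE_CACHE : FLUSH_BY_LINE;
}

/* Hypothetical helper: number of cache lines touched by [addr, addr + len).
 * Assumes addr + len does not wrap around the 32-bit address space. */
static uint32_t lines_in_range(uint32_t addr, uint32_t len)
{
    uint32_t first, last;

    if (len == 0u) {
        return 0u;
    }
    first = addr & ~(L2_LINE_SIZE - 1u);              /* align start down to a line */
    last  = (addr + len - 1u) & ~(L2_LINE_SIZE - 1u); /* line holding the last byte */
    return (last - first) / L2_LINE_SIZE + 1u;
}
```

A range that is not line-aligned can touch one more line than dwLength / 32 suggests, which is why the start address is aligned down before counting.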
During the write-back and invalidate operation (case CACHE_SYNC_DISCARD),
flushing the entire 256K L2 cache works perfectly. This path is used when the
data size is greater than or equal to 256K. For smaller sizes, a line-by-line
flush is implemented, in which the address (pAddr) is used to find the
corresponding cache lines. The L2 cache is accessed using physical addresses,
so the MVA is converted to a physical address using the page table. During
this translation a page fault sometimes occurs, which results in a data
abort. As a workaround, I changed the code to flush the entire L2 cache
instead of flushing by line, and it now works fine. But this is not good
practice: flushing the whole cache where only a few lines need flushing means
I cannot achieve full performance.
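The line-by-line path can be sketched roughly as below. This is a host-side simulation under stated assumptions, not the BSP implementation: the page table is a small array, l2_flush_line stands in for the real L2 clean-and-invalidate operation, and every name here is hypothetical. The point it illustrates is that the translation step can fail for an address with no valid mapping, so the sketch checks the entry and skips the line rather than using an invalid translation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define L2_LINE_SIZE  32u    /* assumed L2 line size, for illustration only */
#define NUM_PAGES     8u     /* tiny simulated virtual address space */

/* Simulated page table: 0 means "not mapped" (a simplification of this
 * sketch), otherwise the entry holds the physical page base address. */
static uint32_t g_page_table[NUM_PAGES];
static uint32_t g_lines_flushed;

/* Hypothetical translation helper: reports failure instead of faulting
 * when the virtual address has no valid mapping. */
static bool va_to_pa(uint32_t va, uint32_t *pa)
{
    uint32_t idx = va / PAGE_SIZE;

    if (idx >= NUM_PAGES || g_page_table[idx] == 0u) {
        return false;
    }
    *pa = g_page_table[idx] | (va & (PAGE_SIZE - 1u));
    return true;
}

/* Stand-in for the hardware L2 line operation; here it only counts calls. */
static void l2_flush_line(uint32_t pa)
{
    (void)pa;
    g_lines_flushed++;
}

/* Flush [va, va + len) one line at a time, skipping lines whose page has no
 * valid translation. Returns the number of lines skipped.
 * Assumes va + len does not wrap around the 32-bit address space. */
static uint32_t l2_flush_range(uint32_t va, uint32_t len)
{
    uint32_t skipped = 0u;
    uint32_t cur = va & ~(L2_LINE_SIZE - 1u);
    uint32_t end = va + len;
    uint32_t pa;

    for (; cur < end; cur += L2_LINE_SIZE) {
        if (va_to_pa(cur, &pa)) {
            l2_flush_line(pa);
        } else {
            skipped++;
        }
    }
    return skipped;
}
```

Whether skipping unmapped pages is acceptable on the real hardware is an open question; the sketch only shows where a validity check would sit in the loop.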
The same operation is also implemented for the L1 cache, and there is no such
problem there. I have read the XScale L2 cache implementation application
note and did not find any mistakes in the implementation in the Zylonite BSP.
I would like to know the cause of this issue. Is it a Windows CE 6.0 issue or
a BSP issue? Please advise.
Thanks in advance,
R.Vinoth (MCTS wince)