Caching documentation

Similar to https://github.com/cubist38/mlx-openai-server/issues/183, it would be helpful to have a document that describes mlx-openai-server's caching implementation. Is the eviction policy LRU? Is there a limit on cache size? How is the cache configured? Are there different caching strategies you can employ, or is there only one?