I'm trying to understand containerized applications and databases, and microservice architecture with Kubernetes. The one thing I can't wrap my head around is the database part.
For example, say I have a Venues database and a Venues REST API application, both running inside the same container.
When the first instance crashes for some reason, Kubernetes launches a second instance. Or there may be several instances of the same service running at once. Or they might all connect to a separate DB service altogether.
What I don't understand is this: when the service that contains the DB crashes, the newly created container will have an empty database, right? How is this handled? If replication is the answer, what about memory and disk usage?
Could someone clarify this and describe the different approaches? If this has already been asked here or elsewhere, please point me in the right direction.
Thanks in advance.
1 Answer
While a database server may run in a container, its storage needs to be an external resource. For example, when running Docker manually, you would mount a volume from the host into the Docker container for the server process to use.
That external storage survives the server process, whether that process is containerized or not. Restarting the process (but with the same storage) will allow the database to recover, if configured appropriately.
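As a rough illustration of that point, here is a minimal Python sketch that uses SQLite as a stand-in for the database server. Everything in it is an assumption made for the example: `run_server_once` is a hypothetical helper that plays the role of one container lifetime, `:memory:` plays the role of storage inside the container, and a temporary directory plays the role of the externally mounted volume.

```python
import os
import sqlite3
import tempfile


def run_server_once(db_path, new_venue=None):
    """Simulate one lifetime of a database process: open, maybe write, read, exit."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS venues (name TEXT)")
        if new_venue is not None:
            conn.execute("INSERT INTO venues (name) VALUES (?)", (new_venue,))
            conn.commit()
        rows = conn.execute("SELECT name FROM venues ORDER BY rowid")
        return [row[0] for row in rows]
    finally:
        conn.close()  # the "container" goes away here


def demo():
    # Storage inside the "container": lost as soon as the process ends.
    first_run = run_server_once(":memory:", "Town Hall")
    ephemeral_restart = run_server_once(":memory:")

    # Storage on an "external volume": a directory that outlives the process.
    with tempfile.TemporaryDirectory() as volume:
        db_path = os.path.join(volume, "venues.db")
        run_server_once(db_path, "Town Hall")
        persistent_restart = run_server_once(db_path)

    return first_run, ephemeral_restart, persistent_restart


if __name__ == "__main__":
    print(demo())
```

The restart on the in-memory database sees an empty table, while the restart that points at the same file sees the row written by the previous run. A real database server behaves the same way with respect to its data directory.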
In a data center setting, the storage is not part of the same machine that the software runs on. Instead, you might have a dedicated disk array that is connected over a storage area network (SAN). In a cloud setting, you would typically rent a block device or virtual drive to store the persistent database data on.
- Then you would need a different setup for database safety outside of Kubernetes, right? – tpaksu, May 7, 2019 at 5:37
- @tpaksu What do you mean exactly? Kubernetes only manages resources; it does not provide "database safety" by itself. If you want persistent databases, you must provide some persistent storage for Kubernetes to manage. – amon, May 7, 2019 at 9:16
- I mean that Kubernetes provides disaster recovery by duplicating pods. If I put the DB files outside the Kubernetes pods and have the database engine reference them, I'll need a replication mechanism or something else to protect those files. Am I right? – tpaksu, May 7, 2019 at 9:59
- @tpaksu If you're running k8s in a cloud such as Azure or AWS, consider using one of the cloud provider's database solutions outside of k8s and configuring your pod to connect to that. They'll have options that make replicating the DB for availability easy. – RubberDuck, May 7, 2019 at 10:02
- If you really want to do this in k8s, you need some persistent storage (a PersistentVolume) and a StatefulSet. Those are the k8s terms you'll need to search for, @tpaksu. – RubberDuck, May 7, 2019 at 10:21
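To give a feel for what those two terms look like together, here is a hedged sketch in Python that builds a StatefulSet manifest as a plain dictionary and prints it as JSON (kubectl accepts JSON as well as YAML). The helper `statefulset_manifest`, the name `venues-db`, the image, the mount path and the storage size are all placeholders chosen for illustration, not values from this thread. A real setup would also need a matching headless Service and whatever environment variables the database image requires.

```python
import json


def statefulset_manifest(name, image, mount_path, storage="1Gi"):
    """Build a single-replica StatefulSet whose pod gets its own persistent volume claim."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service with this name exists
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # The database writes its files here, onto the claimed volume.
                            "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                        }
                    ]
                },
            },
            # One claim is created per pod and is kept when the pod is rescheduled.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder values; adjust the image and data directory to the database in use.
    manifest = statefulset_manifest("venues-db", "postgres", "/var/lib/postgresql/data")
    print(json.dumps(manifest, indent=2))
```

The part that answers the original question is `volumeClaimTemplates`: when the pod crashes and is replaced, the new pod is attached to the same claim, so it does not start with an empty database.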