This is a follow on to : Efficiently detect datacenter based on server hostname and return the full path.
I have a library which is running in different datacenters in production. I have three datacenters in production:
DHP, SLP, LTR
Depending on which datacenter code is running, I need to make a LinkedList (allPaths
) which will have local datacenter path and then followed by remote datacenter path. As an example, if my code is running in DHP datacenter, then allPaths
linked list will have. First element will always be LOCAL datacenter where the code is running and remaining can be other datacenter as per enum.
/tr/datacenter/dhp
/tr/datacenter/slp
/tr/datacenter/ltr
Our machine name is like this in production and it is also possible that machine hostname can have more dots in between separated by another domain in future.
dbx111.dhp.host.com (machine in dhp datacenter)
dbx112.slp.host.com (machine in slp datacenter)
dcx113.ltr.host.com (machine in ltr datacenter)
And our machine name in DEV
(our dev environment) is like this and this is DEV datacenter.
z-wejnv-0432.dev.host.com
Below are my use case:
For code running in production case and if the DataFlowEnum
is PARTIAL
:
- If it is running in
DHP
datacenter, thenLOCAL_PATH
should be/tr/datacenter/dhp
andREMOTE_PATH
SET will be/tr/datacenter/slp
and/tr/datacenter/ltr
. - Similarly If it is running in
SLP
datacenter, thenLOCAL_PATH
should be/tr/datacenter/slp
andREMOTE_PATH
will be/tr/datacenter/dhp
and/tr/datacenter/ltr
. - Similarly If it is running in
LTR
datacenter, thenLOCAL_PATH
should be/tr/datacenter/ltr
andREMOTE_PATH
will be/tr/datacenter/dhp
and/tr/datacenter/slp
.
For code running in production case and if the DataFlowEnum
is TEMP
everything is same as above - Only difference is instead of tr
, I will use trpp
and all the logic is same.
If the code is not running in production or it is running in DEV
datacenter, then LOCAL_PATH
and REMOTE_PATH
should always be /tr/datacenter/dhp
if DataFlowType
is PARTIAL
or /trpp/datacenter/dhp
if DataFlowType
is TEMP
.
Below is my DataCenterEnum
:
public enum DatacenterEnum {
DHP("/datacenter/dhp"), SLP("/datacenter/slp"), LTR("/datacenter/ltr");
private static final Random random = new Random();
private static final DatacenterEnum[] VALUES = values();
private static final int SIZE = VALUES.length;
private static final DatacenterEnum ourLocation = findLocation();
private static final String LOCAL_PATH = ourLocation.findLocalPath();
private static final Set<String> REMOTE_PATH = ourLocation.findRemotePath();
private String value;
private DatacenterEnum(String value) {
this.value = value;
}
public String value() {
return value;
}
public static String forCode(int code) {
return (code >= 0 && code < SIZE) ? VALUES[code].name() : null;
}
private static DatacenterEnum findLocation() {
Optional<String> ourhost = getHostNameOfServer();
if (ourhost.isPresent()) {
if (isDevHost(ourhost.get())) {
return DHP;
}
for (DatacenterEnum dc : VALUES) {
String namepart = "." + dc.name().toLowerCase() + ".";
if (ourhost.get().indexOf(namepart) >= 0) {
return dc;
}
}
// this means the call is coming from some other datacenter apart from DHP, SLP and LTR,
// so we will randomly select DHP, SLP or LTR
return DataUtils.isProduction() ? VALUES[random.nextInt(SIZE)] : DHP;
}
// if it comes here then it means somehow, we failed to find the hostname.
// so we will randomly select DHP, SLP or LTR
return DataUtils.isProduction() ? VALUES[random.nextInt(SIZE)] : DHP;
}
private String findLocalPath() {
String path = DatacenterEnum.DHP.value();
if (DataUtils.isProduction()) {
path = ourLocation.value();
}
return path;
}
private Set<String> findRemotePath() {
Set<String> remotePath = new HashSet<String>();
if (DataUtils.isProduction()) {
// contains all DatacenterEnum except ourLocation
Set<DatacenterEnum> remoteSet = EnumSet.complementOf(EnumSet.of(ourLocation));
for (DatacenterEnum dc : remoteSet) {
remotePath.add(dc.value());
}
} else {
remotePath.add(DatacenterEnum.DHP.value());
}
return remotePath;
}
public static Set<String> getAllPaths(DataFlowEnum dataType) {
Set<String> allPaths = new LinkedHashSet<String>();
String prefix = dataType.equals(DataFlowEnum.PARTIAL) ? DataFlowEnum.PARTIAL.value() : DataFlowEnum.TEMP.value();
allPaths.add(prefix + LOCAL_PATH);
for (String path : REMOTE_PATH) {
allPaths.add(prefix + path);
}
return allPaths;
}
private static final Optional<String> getHostNameOfServer() {
try {
return Optional.of(InetAddress.getLocalHost().getCanonicalHostName().toLowerCase());
} catch (UnknownHostException ex) {
// logging error
return Optional.absent();
}
}
private static boolean isDevHost(String hostName) {
return hostName.indexOf(".dev.") >= 0;
}
public static DatacenterEnum getCurrentDatacenter() {
return ourLocation;
}
}
And below is my DataFlowEnum
which is also being used by other classes:
public enum DataFlowEnum {
PARTIAL("/tr"), TEMP("/trpp");
private String value;
private DataFlowEnum(String value) {
this.value = value;
}
public String value() {
return value;
}
}
This is the way I am calling from my main application code to get all the paths:
Set<String> allPaths = DatacenterEnum.getAllPaths(key.Type());
Does my above code looks right or are there any chances of improvement in terms of efficiency and performance? This calculation should be only one time when the code is called for the first time instead of we are doing same thing for every call. My above code is generic, let's say if I want to add another datacenter, then I just need to add another datacenter in my above ENUM and it will work fine.
I also thought about doing through properties file as suggested by rolfl in my previous question, but in that case we still need to release a new version of client within our company for other customers so I am planning to go with this ENUM solution which is also generic, for any new datacenter, we can just add it and it will work fine.
1 Answer 1
Slight oddities
Your
enum
types do not need anEnum
suffix in the name, calling themDatacenter
orDataFlow
should be clear enough. :)I think your
VALUES
andSIZE
are redundant, as you can callvalues()
andvalues().length
directly. Arguably, specifyingSIZE
is much shorter, but you'll have to use it often enough to 'see' the difference, and given the current code it's better off as a direct reference.Your
static final
variables do not have a consistent casing, they should beUPPERCASE
, thereforerandom => RANDOM
andourLocation => OUR_LOCATION
. On the latter, maybe calling itLOCALHOST
is a plausible option, since it references the host name of the current server?How is
forCode()
used by the other classes? It's odd that you have a method that specifically expects callers to know the ordering of yourenum
values in order to retrieve the desired one... what if someone changesLTR
to beforeSLP
to order them alphabetically?With the exception of the comments in
findLocation()
, I feel that the other comments are redundant as they explained the how and not the why. The link tells you why (pun unintended) it should be the other way round. :) Even forfindLocation()
, you can probably simplify the comment to explain why the random selection suffices - is this a prior technical agreement? Recommended practice? Because the Benevolent(削除) Dictator (削除ここまで)Architect said so?Do
isDevHost(String)
andDataUtils.isProduction()
agree with, i.e. complements, each other? If so, there's a small simplification which I'll explain below.Is the distinction for
LOCAL_PATH
andREMOTE_PATH
purely for determining the arrangement in theSet
when callinggetAllPaths(DataFlowEnum)
? If so, you can probably just determine the arrangement once upon initialization into aList
so that you don't even need to make themstatic final
references... will illustrate this below.getCurrentDatacenter()
can be dropped too, by makingOUR_LOCATION
public
.
Code Simplication
indexOf(CharSequence) >= 0
can be replaced bycontains(CharSequence)
.You can probably simplify your constructor as such, assuming the paths and naming conventions don't change:
private Datacenter() { this.value = "/datacenter/" + name().toLowerCase(); }
The
String
s you use in yourenum
values' declaration are not required. You should also consider adding thefinal
keyword tovalue
too, to make it clear that it cannot be re-assigned.findLocation()
First, since you will always randomize a server or use
DHP
whenDataUtils.isProduction() == false
, you don't need the intermediatereturn
statement after yourfor
loop, retaining just the final one. Second, ifisDevHost(String)
complementsDataUtils.isProduction()
, then you don't even need the former (naturally), and you can also neatly skip that nestedif
. Combining both will result in the following simplification:private static Datacenter findLocation() { Optional<String> host = getHostNameOfServer(); if (host.isPresent() && DataUtils.isProduction()) { for (Datacenter dc : values()) { if (host.get().contains("." + dc.name().toLowerCase() + ".")) { return dc; } } } return DataUtils.isProduction() ? values()[RANDOM.nextInt(values().length)] : DHP; }
One more thing: you can also consider making
"." + dc.name().toLowerCase() + "."
as yourenum
'stoString()
representation.findLocalPath()
This can become the following one-liner:
return (DataUtils.isProduction() ? OUR_LOCATION : DHP).value();
findRemotePath()
(which should really befindRemotePaths()
)To reduce one-level of nesting, you can consider testing for
DataUtils.isProduction()
first as such:private Set<String> findRemotePath() { if (!DataUtils.isProduction()) { return Collections.singleton(DHP.value()); } Set<String> remotePaths = new HashSet<String>(); for (Datacenter dc : EnumSet.complementOf(EnumSet.of(OUR_LOCATION))) { remotePaths.add(dc.value()); } return remotePaths; }
getAllPaths(DataFlowEnum)
dataType.equals(DataFlowEnum.PARTIAL)
can simply bedataType == DataFlowEnum.PARTIAL
. Actually, I'm not sure why you need the comparison, since you can just use it directly. You probably need anull
-check here though, to avoidNullPointerException
s. Effectively, you can just assignString prefix = dataType.value();
.As mentioned above, you probably can consider determining the arrangement once upon initialization into a
List
to simplify the code:private static List<Datacenter> getOrdered() { // create a List by adding findLocation() first, then the other two // e.g. reuse EnumSet.complementOf(EnumSet.of(OUR_LOCATION)) here } private static final List<Datacenter> ORDERED = getOrdered(); public static Set<String> getAllPaths(DataFlow dataType) { // null check first Set<String> allPaths = new LinkedHashSet<String>(); for (Datacenter current : ORDERED) { allPaths.add(dataType.value() + current.value()); } return allPaths; }
DataFlowEnum
One small advice: add the
final
keyword tovalue
to indicate that it cannot be re-assigned.Diamond (sign) for generics type inference
Just a general comment, if you are on Java 7, you can use
<>
for generics type inference.
edit As requested, this is what a possible implementation for getOrdered()
can look like:
private static List<Datacenter> getOrdered() {
if (!DataUtils.isProduction()) {
// if returning just DHP makes sense for non-prod environment
// return Arrays.asList(DHP);
// else replicate existing implementation
return Arrays.asList(DHP, DHP);
}
EnumSet<Datacenter> local = EnumSet.of(getCurrentProdLocation());
List<Datacenter> result = new ArrayList<Datacenter>(local);
for (Datacenter dc : EnumSet.complementOf(local)) {
result.add(dc);
}
return result;
}
I have that chunk of comment at the start to highlight one thing: do you intentionally want your non-prod environment to have the same DHP
value for both local and remote paths? Regardless whether that is a misunderstanding of the actual requirements, you can do the check !DataUtils.isProduction()
and return the appropriate results for the non-production scenario first. As a result, you can skip the same check in findLocation()
too (renaming it as getCurrentProdLocation()
):
private static Datacenter getCurrentProdLocation() {
Optional<String> host = getHostNameOfServer();
if (host.isPresent()) {
for (Datacenter dc : values()) {
// assuming toString() representation is "." + dc.name().toLowerCase() + "."
if (host.get().contains(dc.toString())) {
return dc;
}
}
}
return values()[RANDOM.nextInt(values().length)];
}
-
\$\begingroup\$ Awesome code review. Thanks for the help. I was reading your point 6 in Code Simplification - Why you want to have
ORDERED
List? Can you fill that method as well as I am confuse how would that method will be used ingetAllPaths
then? Does that mean I will not usefindRemotePath()
method \$\endgroup\$david– david2015年05月17日 17:45:15 +00:00Commented May 17, 2015 at 17:45 -
\$\begingroup\$ @david the code for it will be close to putting both
findLocation()
andfindRemotePath()
, just that it returns aList
in the ordering that you require. Yes, you will not require both the methods, and as well asLOCAL_PATH
andREMOTE_PATH
too. This is assuming you definitely do not need to expose them with thepublic
access modifier down the road... \$\endgroup\$h.j.k.– h.j.k.2015年05月18日 01:21:22 +00:00Commented May 18, 2015 at 1:21 -
\$\begingroup\$ I see, can you fill that method as well just to make sure I understood this right? And then how ORDERED will be use then? getAllPaths method will change right? \$\endgroup\$david– david2015年05月18日 01:43:31 +00:00Commented May 18, 2015 at 1:43
-
\$\begingroup\$ @david please see my update... yes,
getAllPaths()
implementation will change. \$\endgroup\$h.j.k.– h.j.k.2015年05月18日 03:30:42 +00:00Commented May 18, 2015 at 3:30