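Cassandra started using YAML files for its configuration from around version 0.7, I believe.
I used to think of YAML as just another format, different from JSON, but the wiki definition describes YAML as a human-readable data serialization format. Hmm, fair enough. It is somewhat easy to read.
Below, let's look at the XML format used in version 0.6 and the YAML format used in the current version (1.0.2). If asked which configuration is more readable, I would answer the XML one.
Judged purely as a format, XML is less efficient than YAML and probably costs more to parse. But in Cassandra these files exist to tune Cassandra so that it runs well for a given cluster or machine. For configuration, I think the readability of the configured values is a very important point. Configuration does not have to be XML, but I would have preferred XML, which (subjectively) feels easier and more readable than YAML. The format is YAML now, so I will have to learn it and use it, but I find that a pity. Ah, who knew I would come to like XML... lol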
The primary difference between Cassandra and Hadoop is that Cassandra
targets real-time/operational data, while Hadoop has been designed for
batch-based analytic work.
There are many technical differences between Cassandra and
Hadoop, including Cassandra’s underlying data structure (based on
Google’s Bigtable), its fault-tolerant, peer-to-peer architecture,
multi-data center capabilities, tunable data consistency, the fact that
all nodes are the same (no concept of a namenode, etc.) and much more.
How does Cassandra differ from HBase?
HBase is an open-source, column-oriented data store modeled after
Google Bigtable, and is designed to offer Bigtable-like capabilities on
top of data stored in Hadoop. However, while HBase shares the Bigtable
design with Cassandra, its foundational architecture is much different.
A Cassandra cluster is much easier to set up and configure than a
comparable HBase cluster. HBase’s reliance on the Hadoop namenode
means there is a single point of failure in HBase, whereas with
Cassandra, because all nodes are the same, there is no such issue.
In internal performance tests conducted at DataStax (using the Yahoo
Cloud Serving Benchmark – YCSB), Cassandra offered 5X better
performance on writes and 4X better performance on reads than HBase.
How does Cassandra differ from MongoDB?
MongoDB is a document-oriented database that is built upon a
master-slave/sharding architecture. MongoDB is designed to store/manage
collections of JSON-style documents.
By contrast, Cassandra uses a peer-to-peer, write/read-anywhere
architecture that is based on a combination of Google Bigtable
and Amazon Dynamo. This allows Cassandra to avoid the various
complications and pitfalls of master/slave and sharding architectures.
Moreover, Cassandra offers linear performance increases as new nodes are
added to a cluster, scales to terabyte-petabyte data volumes, and has
no single point of failure.
If you use Cassandra, or are considering it, there are two files worth reading. Both sit at the top level once you unpack the Cassandra archive.
One is README.txt and the other is NEWS.txt.
README.txt gives an overview of Cassandra and covers installation.
NEWS.txt outlines what changes from one Cassandra version to the next.
For a closer look, I also recommend going through NOTICE.txt and CHANGES.txt.
NOTICE.txt covers the dependency libraries that Cassandra uses.
CHANGES.txt lists what was added or changed in each Cassandra release, together with issue-tracker numbers. Following those issue numbers from CHANGES.txt leads to more detail on what changed in each version.
Jumping straight into Cassandra is fine too, but those are the Cassandra files worth reading at least once. ^^
public void commit() throws SQLException {
    if (!connection.getAutoCommit()) {
        connection.commit();
    }
}

public void close() throws SQLException {
    resetAutoCommit();
    connection.close();
}

protected void resetAutoCommit() {
    try {
        if (!connection.getAutoCommit()) {
            // for compatibility we always use true, as some drivers don't like being left in "false" mode.
            connection.setAutoCommit(true);
        }
    } catch (SQLException e) {
        // Only a very poorly implemented driver would fail here,
        // and there's not much we can do about that.
        throw new TransactionException("Error configuring AutoCommit. " +
            "Your driver may not support getAutoCommit() or setAutoCommit(). Cause: " + e, e);
    }
}
Hmm, so all this leaves me with is a hunch that commit(); close(); and close(); alone would behave the same. ^^;;
Should I call commit(); close();, or is close(); by itself enough? lol
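One detail that bears on the question: the JDBC documentation for Connection.setAutoCommit states that if the method is called during a transaction and the auto-commit mode is changed, the transaction is committed. On a driver that follows this, the setAutoCommit(true) call inside resetAutoCommit() would already commit pending work. The sketch below only traces which calls each path makes. It uses a hypothetical recording stub (recordingConnection and trace are names made up for this illustration), not a real driver, so it says nothing about how a particular driver behaves.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CommitCloseTrace {

    /** Hypothetical stub: a Connection proxy that only records the calls of interest. */
    static Connection recordingConnection(List<String> calls) {
        boolean[] autoCommit = {false}; // start inside a manual transaction
        return (Connection) Proxy.newProxyInstance(
                CommitCloseTrace.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    switch (method.getName()) {
                        case "getAutoCommit":
                            return autoCommit[0];
                        case "setAutoCommit":
                            autoCommit[0] = (Boolean) args[0];
                            calls.add("setAutoCommit(" + args[0] + ")");
                            return null;
                        case "commit":
                        case "close":
                            calls.add(method.getName());
                            return null;
                        default:
                            throw new UnsupportedOperationException(method.getName());
                    }
                });
    }

    /** Mirrors commit() from the listing above. */
    static void commit(Connection c) throws SQLException {
        if (!c.getAutoCommit()) {
            c.commit();
        }
    }

    /** Mirrors close() together with resetAutoCommit() from the listing above. */
    static void close(Connection c) throws SQLException {
        if (!c.getAutoCommit()) {
            c.setAutoCommit(true);
        }
        c.close();
    }

    /** Runs either commit();close(); or close(); alone and returns the recorded calls. */
    static List<String> trace(boolean commitFirst) {
        List<String> calls = new ArrayList<>();
        Connection c = recordingConnection(calls);
        try {
            if (commitFirst) {
                commit(c);
            }
            close(c);
        } catch (SQLException e) {
            throw new IllegalStateException(e); // the stub never throws this
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("commit();close(); -> " + trace(true));
        System.out.println("close();          -> " + trace(false));
    }
}
```

Both paths end with setAutoCommit(true) followed by close; the only difference in the trace is the explicit commit up front. Calling commit() explicitly is still the safer habit, since it does not rely on the driver honoring the setAutoCommit behavior.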
Cassandra.Client belongs to the org.apache.cassandra.thrift package, which makes Client a nested (inner) class of the Cassandra class. That package uses Thrift, a data serialize/deserialize library (similar to Google Protocol Buffers), to build the data to be sent and to process the data received. Cassandra's client libraries for each language are described in detail at http://wiki.apache.org/cassandra/ClientOptions. Client libraries usually provide pooling of connections to the Cassandra server, and connection pooling can be implemented by pooling Cassandra.Client from the org.apache.cassandra.thrift package.
The code below provides pooling of connections to the Cassandra server by pooling Cassandra.Client. ^^
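As a rough illustration of the pooling idea (not the original listing), here is a minimal fixed-size pool. It is deliberately generic: the type parameter T stands in for Cassandra.Client, and the factory would be whatever code opens a Thrift transport and builds a client. SimpleClientPool, borrow, release, and available are names invented for this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Minimal fixed-size pool; T stands in for Cassandra.Client in this sketch. */
public class SimpleClientPool<T> {
    private final BlockingQueue<T> idle;

    public SimpleClientPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // eagerly create every client up front
        }
    }

    /** Borrows an idle client without blocking; returns null when the pool is exhausted. */
    public T borrow() {
        return idle.poll();
    }

    /** Hands a borrowed client back so another caller can reuse it. */
    public void release(T client) {
        idle.offer(client);
    }

    /** Number of clients currently idle in the pool. */
    public int available() {
        return idle.size();
    }

    public static void main(String[] args) {
        // Strings play the role of clients here; a real factory would open a connection.
        SimpleClientPool<String> pool = new SimpleClientPool<>(2, () -> "client");
        String c = pool.borrow();
        System.out.println("idle after borrow: " + pool.available());
        pool.release(c);
        System.out.println("idle after release: " + pool.available());
    }
}
```

A production pool would presumably also need to detect broken connections and replace them, and to close clients on shutdown; the sketch leaves those out.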
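In Cassandra, data replication is configured on the server side through <ReplicationFactor>1</ReplicationFactor>. On the client side, reads and writes take a ConsistencyLevel parameter that selects asynchronous or synchronous behavior (return as soon as one node has the data, or return only after the configured number of nodes have stored it), so the policy can be matched to the nature of the workload. ConsistencyLevel is therefore something you need to know well.
The comment on ConsistencyLevel in version 0.6.2 reads as follows.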
/**
* The ConsistencyLevel is an enum that controls both read and write behavior based on <ReplicationFactor> in your
* storage-conf.xml. The different consistency levels have different meanings, depending on if you're doing a write or read
* operation. Note that if W + R > ReplicationFactor, where W is the number of nodes to block for on write, and R
* the number to block for on reads, you will have strongly consistent behavior; that is, readers will always see the most
* recent write. Of these, the most interesting is to do QUORUM reads and writes, which gives you consistency while still
* allowing availability in the face of node failures up to half of <ReplicationFactor>. Of course if latency is more
* important than consistency then you can use lower values for either or both.
*
* Write:
* ZERO Ensure nothing. A write happens asynchronously in background
* ANY Ensure that the write has been written once somewhere, including possibly being hinted in a non-target node.
* ONE Ensure that the write has been written to at least 1 node's commit log and memory table before responding to the client.
* QUORUM Ensure that the write has been written to <ReplicationFactor> / 2 + 1 nodes before responding to the client.
* ALL Ensure that the write is written to <code><ReplicationFactor></code> nodes before responding to the client.
*
* Read:
* ZERO Not supported, because it doesn't make sense.
* ANY Not supported. You probably want ONE instead.
* ONE Will return the record returned by the first node to respond. A consistency check is always done in a
* background thread to fix any consistency issues when ConsistencyLevel.ONE is used. This means subsequent
* calls will have correct data even if the initial read gets an older value. (This is called 'read repair'.)
* QUORUM Will query all storage nodes and return the record with the most recent timestamp once it has at least a
* majority of replicas reported. Again, the remaining replicas will be checked in the background.
* ALL Not yet supported, but we plan to eventually.
*/
When writing and reading data from a client, it is essential to be familiar with the comment above. ^^
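The two rules in that comment, QUORUM = ReplicationFactor / 2 + 1 and "W + R > ReplicationFactor gives strongly consistent behavior", are easy to check with a few lines of arithmetic. The class and method names below are made up for this sketch and are not part of the Cassandra API.

```java
public class ConsistencyMath {

    /** QUORUM size as stated in the comment: ReplicationFactor / 2 + 1 (integer division). */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when W + R > ReplicationFactor, so every read set overlaps every write set. */
    static boolean stronglyConsistent(int w, int r, int replicationFactor) {
        return w + r > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("QUORUM for ReplicationFactor 3: " + q);
        System.out.println("QUORUM write + QUORUM read consistent: " + stronglyConsistent(q, q, rf));
        System.out.println("ONE write + ONE read consistent: " + stronglyConsistent(1, 1, rf));
    }
}
```

With a ReplicationFactor of 3, QUORUM is 2 nodes, so QUORUM reads plus QUORUM writes satisfy 2 + 2 > 3, while ONE plus ONE (1 + 1 = 2) does not.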
Cassandra.Client is the inner class that uses Thrift (http://incubator.apache.org/thrift/) to insert data into and select data from Cassandra. The CassandraClientFactory class that I use to create and use Cassandra.Client is shown below.
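I tried Apache HttpClient, a library that makes HTTP request/response handling easy. Hmm.. UTF-8 pages are no problem, but on pages encoded in EUC-KR the Korean text comes out garbled.
To solve this, decode the received data with the charset that matches the page.
As in the code below, store the received data in a String variable x, convert x back via the charset below (iso-8859-1), store the result in a String y, and print it; the text then comes out correctly, as in the figure below.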
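A minimal sketch of that re-decoding step, under the assumption that the response bytes were EUC-KR but were decoded as ISO-8859-1 when the String x was built. Because ISO-8859-1 maps every byte to exactly one character, the original bytes can be recovered from x and decoded again as EUC-KR. The class and method names are made up for this sketch, and the garbled input is simulated rather than fetched over HTTP.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EucKrRedecode {

    /** Recovers text whose EUC-KR bytes were mistakenly decoded as ISO-8859-1. */
    static String redecode(String garbled) {
        // ISO-8859-1 is a one-to-one byte mapping, so this yields the original bytes.
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName("EUC-KR"));
    }

    public static void main(String[] args) {
        // Two Hangul syllables written as Unicode escapes to keep this source file ASCII-only.
        String original = "\uD55C\uAE00";

        // Simulate the problem: EUC-KR bytes decoded with the wrong charset.
        String x = new String(original.getBytes(Charset.forName("EUC-KR")),
                StandardCharsets.ISO_8859_1);

        String y = redecode(x);
        System.out.println("recovered correctly: " + y.equals(original));
    }
}
```

Where the library allows it, reading the response as bytes and decoding them once with the right charset avoids the round trip altogether.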
In the following section, we compare various features of the two frameworks. Struts 2.0 is much simpler than Struts 1.0 and 1.1; a few of its notable features are:
1.Servlet Dependency
Actions in Struts 1 depend on the servlet API, since the
HttpServletRequest and HttpServletResponse objects are passed to the
execute method when an Action is invoked. In Struts 2.0, Actions are not container dependent because they are simple POJOs. Servlet contexts are represented as simple Maps, which allows Actions to be tested in isolation. Struts 2.0
Actions can still access the original request and response if required.
However, other architectural elements reduce or eliminate the need to
access the HttpServletRequest or HttpServletResponse directly.
2.Action classes
Programming to abstract classes instead of interfaces is one of the
design issues of the Struts 1.0 framework that has been resolved in the Struts 2.0
framework. Struts 1.0 Action classes need to extend a framework-dependent
abstract base class. A Struts 2.0 Action class
may or may not implement interfaces to enable optional and custom
services. Struts 2.0
provides a base ActionSupport class that implements commonly used
interfaces, but even the Action interface is not required. Any POJO
with an execute signature can be used as a Struts 2.0 Action object.
3.Validation
Struts 1.0 and Struts 2.0 both support manual
validation via a validate method.
Struts 1.0 uses a validate method on the ActionForm, or validates through
an extension to the Commons Validator. Struts 2.0 supports
manual validation via the validate method and the XWork Validation
Framework. The XWork Validation Framework supports chaining validation
into sub-properties using the validations defined for the property's
class type and the validation context.
4.Threading Model
In Struts 1, Actions are singletons: there is only one instance of a
class to handle all requests for that Action, so Action resources must
be thread-safe or synchronized. The
singleton strategy places restrictions on what can be done with Struts
1.0 Actions and requires extra care to develop. In Struts 2.0,
Action objects are instantiated for each request, so there are no
thread-safety issues. (In practice, servlet containers generate many
throw-away objects per request, and one more object does not impose a
performance penalty or impact garbage collection.)
5.Testability
Testing Struts 1.0 applications is a bit complex. A major hurdle to
testing Struts 1.0 Actions is that the execute method exposes
the Servlet API. A third-party extension, Struts TestCase, offers a set
of mock objects for Struts 1. Struts 2.0
Actions can be tested by instantiating the Action, setting properties,
and invoking methods. Dependency Injection support also makes testing
simpler. Actions in Struts 2 are simple POJOs and are framework
independent, so testing them is quite easy.
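To make the "instantiate, set properties, invoke" idea concrete, here is a small sketch of a POJO-style Action and a test-like check in main. GreetAction and its properties are hypothetical names invented for this example; the class uses no Struts types at all, and the result names "success" and "input" are written as plain strings that follow the usual Struts 2 convention.

```java
public class GreetAction {
    private String username; // input property, set by the framework or by a test
    private String message;  // output property, read by the view or by a test

    public void setUsername(String username) {
        this.username = username;
    }

    public String getMessage() {
        return message;
    }

    /** Plain execute method with no servlet types in its signature. */
    public String execute() {
        if (username == null || username.isEmpty()) {
            return "input"; // ask for the form again
        }
        message = "Hello, " + username;
        return "success";
    }

    public static void main(String[] args) {
        // No container or mock objects needed: create, set, invoke, inspect.
        GreetAction action = new GreetAction();
        action.setUsername("kim");
        System.out.println(action.execute() + " / " + action.getMessage());
    }
}
```

Since each request would get its own instance, the fields need no synchronization, which ties back to the threading point in section 4.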
6.Harvesting Input
Struts 1.0 uses an ActionForm object to capture input, and all
ActionForms need to extend a framework-dependent base class. Since other JavaBeans
cannot be used as ActionForms, developers often have to create
redundant classes to capture input. Struts 2.0 uses Action properties as input
properties, independent of the underlying framework, which eliminates the
need for a second input object and so reduces redundancy. Additionally,
in Struts 2.0, Action properties can be accessed from the web page via the taglibs. Struts 2.0
also supports the ActionForm pattern, as well as POJO form objects and
POJO Actions. Even rich object types, including business or domain
objects, can be used as input/output objects.
7.Expression Language
Struts 1.0 integrates with JSTL, so it uses the JSTL EL. The EL has basic object graph traversal, but relatively weak collection and indexed property support. Struts 2.0
can also use JSTL, but it additionally supports a more powerful and flexible
expression language called "Object Graph Navigation Language" (OGNL).
8.Binding values into views
In the view layer, Struts 1 uses the standard JSP mechanism to bind
objects (processed from the model) into the page context for
access. Struts 2.0
uses a "ValueStack" technology so that the taglibs can access values
without coupling your view to the object type it is rendering. The
ValueStack strategy allows the reuse of views across a range of types
which may have the same property name but different property types.
9.Type Conversion
Struts 1.0 ActionForm properties are usually all Strings. Struts
1.0 uses Commons-Beanutils for type conversion. These type converters
are per-class and not configurable per instance. Struts 2.0 uses OGNL for type conversion. The framework includes converters for basic and common object types and primitives.
10.Control Of Action Execution
Struts 1.0 supports separate Request Processors (lifecycles) for each
module, but all the Actions in a module must share the same lifecycle.
Struts 2.0
supports creating different lifecycles on a per-Action basis via
Interceptor Stacks. Custom stacks can be created and used with
different Actions as needed.
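1. How to calculate the memory an application server needs
- Formula: (MaxProcessMemory - JVMMemory - ReservedOsMemory) / (ThreadStackSize) = Number of threads
- Worked example, assuming Java 1.5 is in use, the OS reserves 120MB, and the default stack size is 0.5M (note that the figures below divide by a 1MB stack size)
With 1.5GB allocated to the JVM: (2GB-1.5GB-120MB)/(1MB) = ~380 threads
With 1.0GB allocated to the JVM: (2GB-1.0GB-120MB)/(1MB) = ~880 threads
Statistically, supporting roughly 200 concurrent users takes about 300MB. Take this into account when calculating memory.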
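The formula can be checked with a few lines of Java. This is only a sketch of the arithmetic: ThreadCapacity and maxThreads are names made up here, all sizes are in MB, and 1GB is counted as 1000MB because that is what reproduces the rounded ~380 and ~880 figures.

```java
public class ThreadCapacity {

    /** (MaxProcessMemory - JVMMemory - ReservedOsMemory) / ThreadStackSize, all in MB. */
    static long maxThreads(long maxProcessMb, long jvmMb, long reservedOsMb, double stackMb) {
        return (long) ((maxProcessMb - jvmMb - reservedOsMb) / stackMb);
    }

    public static void main(String[] args) {
        // 2GB process limit, 120MB reserved for the OS, 1MB per thread stack.
        System.out.println("JVM 1.5GB: " + maxThreads(2000, 1500, 120, 1.0) + " threads");
        System.out.println("JVM 1.0GB: " + maxThreads(2000, 1000, 120, 1.0) + " threads");
    }
}
```

The less memory is handed to the JVM heap, the more is left over for thread stacks, which is why the 1.0GB case allows more threads than the 1.5GB case. Plugging in a smaller stack size raises the thread count proportionally.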
2. Handling application server errors (java.lang.OutOfMemoryError: PermGen space)
Use JHat to find the cause of memory leaks, and monitor memory with JConsole, Lambda Probe, and similar tools.
Application server operators need to understand garbage collection.
3. Example settings for Tomcat
Print heap memory information: -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC
Check the log produced by these settings for the eden space of the New Generation, the Old Generation, and the Permanent Generation; if any of them is too small, use settings like the following to give it an appropriate size.
Resulting settings: -Xms256m -Xmx512m -XX:NewSize=256m -XX:MaxNewSize=256m -XX:MaxPermSize=128m -XX:SurvivorRatio=5
-Xms : minimum heap size
-Xmx : maximum heap size
-XX:NewSize : minimum size of the New Generation
-XX:MaxNewSize : maximum size of the New Generation
-XX:MaxPermSize : maximum size of the Permanent Generation
-XX:SurvivorRatio : ratio of the eden space to a survivor space within the New Generation
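Besides reading GC logs, the same regions can be inspected from inside the JVM through the standard java.lang.management API. The sketch below lists each memory pool with its current usage. MemoryPoolReport is a name made up here, and the pool names in the output depend on the JVM version and collector (for example, JVMs from Java 8 onward report Metaspace instead of a Permanent Generation), so treat the output as informational.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryPoolReport {

    /** Builds one line per memory pool: name, used KB, and max KB (-1 when undefined). */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long maxKb = usage.getMax() < 0 ? -1 : usage.getMax() / 1024;
            sb.append(String.format("%-32s used=%dK max=%dK%n",
                    pool.getName(), usage.getUsed() / 1024, maxKb));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running this inside the same JVM as the application (for instance from a diagnostic page) gives a quick view of which region is close to its limit before changing any of the flags above.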
Bootstrap classes of your JVM
System class loader classes (described above)
/WEB-INF/classes of your web application
/WEB-INF/lib/*.jar of your web application
$CATALINA_HOME/common/classes
$CATALINA_HOME/common/endorsed/*.jar
$CATALINA_HOME/common/i18n/*.jar
$CATALINA_HOME/common/lib/*.jar
$CATALINA_BASE/shared/classes
$CATALINA_BASE/shared/lib/*.jar
Tomcat 5.0 class loading order
Bootstrap classes of your JVM
System class loader classes (described above)
/WEB-INF/classes of your web application
/WEB-INF/lib/*.jar of your web application
$CATALINA_HOME/common/classes
$CATALINA_HOME/common/endorsed/*.jar
$CATALINA_HOME/common/lib/*.jar
$CATALINA_BASE/shared/classes
$CATALINA_BASE/shared/lib/*.jar
Tomcat 4.1 class loading order
/WEB-INF/classes of your web application
/WEB-INF/lib/*.jar of your web application
Bootstrap classes of your JVM
System class loader classes (described above)
$CATALINA_HOME/common/classes
$CATALINA_HOME/common/endorsed/*.jar
$CATALINA_HOME/common/lib/*.jar
$CATALINA_BASE/shared/classes
$CATALINA_BASE/shared/lib/*.jar